AC AC C AAT C C AAAAG C GT GG AACT AT GT T AAAAAG C T AC AAC AT AAT AT T
AAT G CT AT C AAAT CT T C T AG CT C T T C AG AAG T T TAT C AAT C AGT T G C AG A 
AGGAAAAAT GAT T GT GG G GT T GAC T TAG G AAGAC C CT AGT G T C AAT T T G C 
AAAAAAGT G GT G C C AAT G T T T C TAT T GT AT AT CCG AC AGAAG GG AC AGT T 
TTTGTCCCATCTTCGGTTGCAATTATAAAGAATGCTCCTTCTATGAAAGA 
AGCAAAGTTATTTATTAATTTTATGCTTTCTTTAGATGTTCAAAATGCCT 
T T GGG C AGT C AAC GAGT AAC C G AC CT AT T C GT AAAG AT GC C C AAAC GAG T 
AATGGCATGAAAGCTTTAAAGGATATTGCTACTCTTAAAGAAGATTATCG 
CTATGTCACTAAGCATAAGGGCCAAATCCTTAAAACCTATAATCGTATTC 
GTAGAAATGCTGAT 

SEQ ID NO. 6004 

STRAIN H36B

T AAAC T AC T T C C AC C AAAAG AAT T AGT TAT T CT AAGT C C AAAT AG T C AAG 
C CAT T T T AAC AG G AAC GAT T C C AG C T T T T GAG GAAAAAT AC GG TAT AAAA 
GT T AAG C T TAT T C AAG G T G GG AC AGGT C AACT AAT AG AT AGAT T AAGT AA 
G G AGGGT AAGC AGT T G AAG G C GG AT AT T T T C T T T GG AG G AAAT TAT ACG C 
AAT T T G AAAG T C AT AAGG C AT T GT T T GAGT CT T AC GT AT C AAAG AAT AT T 
CAT ACT GT T ATT C C AGAT TAT AT C CAT C CGAGT G AT ACG G C GAC AC C T T A 
T AC TAT AAAT GG G AG T GT CT T GAT T GT AAAT AAc G AAT T AGT T AAGG GAC 
T T AC C AT C AAG AGT TAT G AAGAT T TAT T AC AG C C T T C C T T AAAAGGT AAA 
ATTGCCTTTGCAGATCCGAATACTTCCTcTAGTGCTTTCTCACAACTCAC 
T AAT AT ACT CT T GG C C AAG G G T G G T T AC AC C AAT C C AAAAGC GT G GAAC T 
AT GTT AAAAAGCTACAACAT AAT AT T AAT GCT AT CAAAT CT T CT AGCT CT 
T C AG AAG T T TAT C AAT C AG T T G C AG AAGG AAAAAT GAT TGTGGGGTT GAC 
TTACGAAGACCCTAGTGTCAATTTGCAAAAAAGTGGTGCCAATGTTTCTA 
TTGTATATCCGACAGAAGGGACAGTTTTTGTCCCATCTTCGGTTGCAATT 
AT AAAGAATGCT C CTT CT AT GAAAGAAGC AAAGTT AT TT AT T AAT T T T AT 
GCTTTCTTT AGAT GTTCAAAATGCCTTTGGGC AGT CAACGAGTAACCGAC 
CT AT T CGTAAAGATGC CC AAAC GAGT AAT GGCAT GAAAGCTTTAAAGGAT 
AT T G C T AC T CT T AAAG AAG AT T AT CGC T AT GT C AC T AAG C AT AAGGG C C A 
AAT C CT T AAAAC C TAT AAT C GT AT T C GT AG AAAT GCT GAT 

SEQ ID NO. 6005 

STRAIN 18RS21 

C AGC C T T C T AAAC T AC T T C C AC C AAAAG AAT TAG T T AT T C T AAGT C C AAA 
T AGT C AAG C CAT T T T AAC AGG AACG AT T C C AG C T T T T GAG GAAAAAT AC G 
GT AT AAAAGT T AAG C T T AT T C AAGG TG GG AC AG GG C AAC T AAT AG AT AG A 
T T AAGT AAGG AG GGT AAG C AGT TG AAG G C G g AT AT TTTCTTTG GAG G AAA 
T TAT ACG C AAT T T G AAAG T CAT AAGG CAT T G T T T GAG T C T T AC G T AT C AA 
AG AAT GT T CAT AC T G T TAT T C C AG AC TAT AT C CAT C C AAGT GAT AC G G C G 
AC AC CT TAT ACT AT AAAT GG G AGT GT CTT GAT T GT AAAT AAC G AAT TAG C 
T AAG GG AC T T AC CAT C AAG AGT TAT G AAG AT T TAT T ACAGC CTT CCT TAA 
AAGGTAAAATTGCCTTTGCAGATCCGAATACTTCCTCTAGTGCTTTCTCA 
C AAC T C AC T AAT AT AC T CT T GG C C AAG G G T G G T TAG AC C AAT C C AAAAGC 
GT G GAAC TAT GT T AAAAAGC T AC AAC AT AAT AT T AAT G C T AT CAAAT CTT 
CTAGCTCTTCAGAAGTTTATCAATCAGTTGCAGAAGGAAAAATGATTGTG 
GGGCTGACTTACGAAGACCCTAGTGTCAATTTGCAAAAAAGTGGTGCCAA 
T GT T T CT AT T G TAT AT C C GAC AG AAGGG AC AGT TTTTGTCC CAT C T T CGG 
TTGCAATTATAAAGAATGCTCCTTCTATGAAAGAAGCAAAGTTATTTATT 
AATTTTATGCTTTCTTTAGATGTTCAAAATGCCTTTGGGCAGTCAACGAG 
T AAC CG AC C TAT T C G T AAAG AT G C C C AAAC GAG T AAT GG C AT G AAAG CTT 
T AAAGGAT AT T G C TACT C T T AAAG AAG AT T AT CGC TAT G T C ACT AAG CAT 
AAGGGCCAAAT C CTT AAAAC CT AT AAT CGT ATT CGTAGAAAT GCT GAT 

SEQ ID NO. 6006 

STRAIN M732 

C AG C C T T CT AAACT AC T T C C AC C AAAAG AAT TAG T 

T ATT CT AAGT C CAAAT AGT CAAGCC AT T TT AAC AGGAACGATT C CAGCTT 
TTGAGG AAAAAT ACGGT AT AAAAGT T AAG CT TAT T CAAGGT GGG AC AGGG 
C AAC T AAT AG AT AG AT T AAGT AAG GAG G GT AAG C AGT T G AAGG CG GAT AT 
TTTCTTTGGAGGAAATTATACGCAATTTGAAAGTCATAAGGCATTGTTTG 
AGT CTT ACG TAT C AAAG AAT GT T CAT ACT GTT ATT CC AG ACT AT AT CC AT 
CCGAGTGATACGGCGACACCTTATACTATAAATGGGAGTGTCTTGATTGT 
AAATAACGAATTAGCTAAGGGACTTACCATCAAGAGTTATGAAGATTTAT
TACAGCCTTCCTTAAAAGGTAAAATTGCCTTTGCAGATCCGAATACTTCC 
T C T AGT G CT T T CT C AC AACT C ACT AAT AT ACT CT T GG C C AAG GGT GGT T A 
CACCAAT C CAAAAGCGTGG AACT ATGTT AAAAAGCTACAACAT AATAT TA 
ATGCTATCAAATCTTCTAGCTCTTCAGAAGTTTATCAATCAGTTGCAGAA 
GGAAAAATGATTGTGGGGTTGACTTACGAAGACCCTAGTGTCAATTTGCA 
AAAAAGTGGTGCCAATGTTTCTATTGTATACCCGACAGAAGGGACAGTTT 
TTGTCCCATCTTCGGTTGCAATTATAAAGAATGCTCCTTCTATGAAAGAA 
GCAAAGTTATTTATTAATTTTATGCTTTCTTTAGATGTTCAAAATGCCTT 
T GG G C AGT C AAC G AGT AAC C G AC C TAT T C GT AAAG AT G C C C AAAC AAGT A 
ATGGCATGAAAGCTTTAAAGGATATCGCTACTCTTAAAGAAGATTATCGC 
TATGTCACTAAGCATAAGAGCCAAATCCTTAAAACCTATAATCGCATTCG 
T AG AAAT G C T GAT 

SEQ ID NO. 6007 

STRAIN COH1 

CAGCCTTCTAAACTACTTCCACCAAAAGAATTAGTT 

AT T CT AAGT C C AAAT AGT C AAG C CAT T T T AAC AGG AAC GAT T C C AG CT T T 
T GAG G AAAAAT AC GGT AT AAAAG T T AAGC T T AT T C AAG G T G G G AC AG G G C 
AACTAATAGATAGATTAAGTAAGGAGGGTAAGCAGTTGAAGGCGGATATT 
TTCTTTGGAGGAAATTATACGCAATTTGAAAGTCATAAGGCATTGTTTGA 
GT CT T ACGT AT C AAAG AAT GT T CAT ACT GT TAT T C C AG AC TAT AT C CAT C 
CGAGTGATACGGCGACACCTTATACTATAAATGGGAGTGTCTTGATTGTA 
AATAACGAATTAGCTAAGGGACTTACCATCAAGAGTTATGAAGATTTATT 
ACAGCCTTCCTTAAAAGGTAAAATTGCCTTTGCAGATCCGAATACTTCCT 
C T AGT G C T T T C T C AC AAC T C ACT AAT AT AC T C T T GG C C AAGGGT GGT T AC 
AC C AAT C C AAAAG CGT G GAAC T AT GT T AAAAAG CT AC AAC AT AAT AT T AA 
TGCTATCAAATCTTCTAGCTCTTCAGAAGTTTATCAATCAGTTGCAGAAG 
G AAAAAT GAT TGTGGGGTT G ACT T AC G AAGAC CCT AG T GT C AAT TT G C AA 
AAAAGTGGTGCCAATGTTTCTATTGTATACCCGACAGAAGGGACAGTTTT 
TGTCCCATCTTCGGTTGCAATTATAAAGAATGCTCCTTCTATGAAAGAAG 
CAAAGTTATTTATTAATTTTATGCTTTCTTTAGATGTTCAAAATGCCTTT 
G GG C AGT C AAC G AG T AAC C GAC C T AT T C GT AAAG ATGC CC AAAC AAGT AA 
T G G CAT G AAAG CT T T AAAG GAT AT C G C T ACT C T T AAAGAAG AT TAT C GC T 
AT G T C AC T AAG CAT AAG AG C C AAAT C CT T AAAAC C TAT AAT C GC AT T C G T 
AGAAAT G C T GAT 

SEQ ID NO. 6008 

STRAIN M781

C AG C C T T C T AAAC T ACT T C C AC C AAAAG AAT T AGT TAT T 

CTAAGTCCAAATAGTCAAGCCATTTTAACAGGAACGATTCCAGCTTTTGA 

GGAAAAATACGGTATAAAAGTTAAGCTTATTCAAGGTGGGACAGGGCAAC 

TAATAGATAGATTAAGTAAGGAGGGTAAGCAGTTGAAGGCGGATATTTTC 

T T T G G AGG AAAT T AT ACG C AAT T T GAAAGT C AT AAGG C AT T GT T T G AGT C 

TTACGTATCAAAGAATGTTCATACTGTTATTCCAGACTATATCCATCCGA 

GT GAT ACG G C GAC AC C T TAT AC T AT AAAT G GG AGT G T C T T GAT T G T AAAT 

AAC GAAT T AG C T AAGG GAC T T AC CAT C AAG AG T TAT G AAG AT T TAT T AC A 

GCCTTCCTTAAAAGGTAAAATTGCCTTTGCAGATCCGAATACTTCCTCTA 

GTGCTTTCTCACAACTCACTAATATACTCTTGGCCAAGGGTGGTTACACC 

AATCCAAAAGCGTGGAACTATGTTAAAAAGCTACAACATAATATTAATGC 

TATCAAATCTTCTAGCTCTTCAGAAGTTTATCAATCAGTTGCAGAAGGAA 

AAATGATTGTGGGGTTGACTTACGAAGACCCTAGTGTCAATTTGCAAAAA 

AGT GGT GC C AAT G T T T C T AT T GT AT AC C C GAC AG AAG G GAC AGT T T T T GT 

CCCATCTTCGGTTGCAATTATAAAGAATGCTCCTTCTATGAAAGAAGCAA 

AGTTATTTATTAATTTTATGCTTTCTTTAGATGTTCAAAATGCCTTTGGG 

C AGT C AAC G AGT AAC C GAC C TAT T C GT AAAG AT G C C C AAAC AAGT AAT GG 

CAT G AAAG C T T T AAAGG AT AT CGCTACTCT T AAAG AAG AT TAT C G C T AT G 

T C AC T AAG CAT AAG AG C C AAAT CCT T AAAAC C TAT AAT C G C AT T C G T AG A 

AATGCTGAT 

SEQ ID NO. 6009 

STRAIN CJB110 

CAGCCTTTTAAACTACTTCCACCAAAAGAATTAGTTATTCT 
AAGTCCAAATAGTCAAGCCATTTTAACAGGAACGATTCCAGCTTTTGAGg 
AAAAAT ACG GT AT AAAAGT T AAG C T TAT T C AAG GT GG GAC AGGG C AAC T A
AT AGAT AGAT T AAGT AAG G AGGGT AAG C AGT T GAAG G CG G AT AT T T T CTT 

TGGAGGAAATTATACGCAATTTGAAAGTCATAAGGCATTGTTTGAGTCTT 
ACG T AT C AAAGAAT GT T CAT AC T GT T AT T C C AGAC TAT AT C CAT C C AAGT 

GATACGGCGACACCTTATACTATAAATGGGAGTGTCTTGATTGTAAATAA 
C GAAT T AG C T AAG GGACT T AC CAT C AAGAGT T AT GAAG AT T T AT T AC AG C 

CTTCCTTAAAAGGTAAAATTGCCTTTGCAGATCCGAATACTTCCTCTAGT 
G CT T T C T CAC AAC T C ACT AAT AT AC T CT T GG C C AAG GGT GGT T AC AC C AA 
T C C AAAAGCG T GG AAC TAT GT T AAAAAG CT AC AAC AT AAT AT T AAT G C T A 
T C AAAT C T T CT AG CT CTT C AGAAGT T T AT C AAT C AGT T GC AGAAGGAAAA 
AT GAT TGTGGGGCT GAC T T AC GAAG AC C C T AGT GT C AAT T T G C AAAAAAG 

TGGTGCCAATGTTTCTATTGTATATCCGACAGAAGGGACAGTTTTTGTCC 
CAT CTT CGGT TGCAATT AT AAAGAAT GCTCCTT CT ATGAAAGAAGCAAAG 

TTATTTATTAATTTTATGCTTTCTTTAGATGTTCAAAATGCCTTTGGGCA 
GT C AACG AGT AAC C G AC CT AT T C GT AAAG AT G C C C AAAC GAGT AAT GGC A 
T GAAAGC T T T AAAG GAT AT T G C T ACT C T T AAAG AAG AT TAT C G C T AT GT C 
ACT AAG C AT AAGG G C C AAAT C C T T AAAAC C TAT AAT C GT AT T C G TAG AAA 
TGCTGAT 

SEQ ID NO. 6010 

STRAIN 1169NT 

ATAGTCAAGCCATTTTAACAGGAACGATTCCAGCTTTTGAGGAAAAATAC 
GGTATAAAAGTTAAGCTTATTCAAGGTGGGACAGGGCAACTAATAGATAG 
AT T AAGT AAG G AGG GT AAG C AT T T GAAG G C GG AT AT T T T C T t T GG AG G AA 
AT TAT AC G C AAT T T G AAAGT CAT AAG G CAT T G TT T GAGT C T T AC G TAT C A 

AAGAATGTTCATACTGTTATTCCAGACTATATCCATCCAAGTGATACGGC 
GAC AC C T TAT AC TAT AAAT G GG AG T GT C T T GAT T G T AAAT AAC GAAT TAG 
C T AAG G G AC T T AC CAT C AAG AGT TAT GAAG AT T TAT T AC AG CCTTCCTTA 

AAAGGTAAAATTGCCTTTGCAGATCCGAATACTTCCTCTAGTGCTTTCTC 
AC AAC T C AC C AAT AT AC T C T T GG C AAAGG GT G G T T AC AC C AAT C C AAAAG 
C GT GG AACT AT GT T AAAAAG C T AC AAC AT AAT AT T AAT G C TAT C AAAT C T 
T C T AG CT C T T C AG AAGT T TAT C AAT C AGT T G C AG AAGGAAAAAT GAT T G T 
GGGGTTGACTTACGAAGACCCTAGTGTCAATTtGCAAAAAAGTGGTGCCA 
ATGTTTCTATTGTATATCCGACAGAAGGGACAGTTTTTGTCCCATCTTCG 
G T T GC AAT TAT AAAG AAT GCTCCTTC TAT GAAAG AAG C AAAGT TAT T TAT 

TAATTTTATGCTTTCTTTAGATGTTCAAAATGCCTTTGGGCAGTCAACGA 
GT AAC C GAC CT AT T C GT AAAG AT GCC C AAAC GAGT AAT GG CAT G AAAG CT 
T T AAAGG AT AT T G C T ACT CT T AAAG AAG AT TAT C G CT AT GT CAC T AAG C A 
T AAGG G C C AAAT C C T T AAAAC CT AT AAT CGT AT T CGT AG AAAT G CT GAT 

SEQ ID NO. 6011 

STRAIN JM9130013

C AG C CTT CT AAAC T AC T T C CAC C AAAAG AAT TAG T 

TAT T C T AAGT C C AAAT AG T C AAG C CAT T T T AAC AG G AACG AT T C C AGC T T 
T T GAG G AAAAAT AC GGT AT AAAAGT T AAG C T TAT T C AAG GT G GGAC AGGG 

CAACTAATAGATAGATTAAGTAAGGAGGGTAAGCAGTTGAAGGCGGATGT 
TTTCTTTG G AGG AAAT TAT AC GC AAT T T G AAAGT CAT AAG G CAT T GT TT G 
AGT C T T AC G TAT C AAAG AATG T T CAT AC T G T TAT T C C AG AC TAT AT C CAT 
C C GAGT GAT AC GGC GAC AC CT TAT ACT AT AAAT GGG AGT G T C T T GAT T G T 
AAAT AAC GAAT TAG C T AAG GGACT T AC CAT C AAGAG T TAT GAAG AT T TAT 
TACAGCCTTCCTTAAAAGGTAAAATTGCCTTTGCAGATCCGAATACTTCC 
TCTAGTGCTTTCTCACAACTCACCAATATACTCTTGGCAAAGGGTGGTTA 
CAC C AAT C C AAAAG CGT GGAAC T AT GT T AAAAAG C T AC AAC AT AAT AT T A 

ATGCTATCAAATCTTCTAGCTCTTCAGAAGTTTATCAATCAGTTGCAGAA 
GGCAAAATGATTGTGGGGCTGACTTACGAAGACCCTAGTGTCAATTTGCA 
AAAAAGTGGTGCCAATGTTTCTATTGTGTATCCGACAGAAGGGACAGTTT 
TTGTCCCATCTTCGGTTGCAATTATAAAGAATGCTCCTTCTATGAAAGAA 
GCAAAGTTATTTATTAATTTTATGCTTTCTTTAGATGTTCAAAATGCCTT 
T G GG C AGT C AAC GAGT AAC C GAC C T AT T C GT AAAG AT G C C C AAAC GAGT A 
AT G G CAT GAAAG CTT T AAAG GAT AT T G CT AC T C T T AAAG AAG AT TAT C G C 
TAT GT C AC T AAG CAT AAG GGC C AAAT C C T T AAAAC CT AT AAT C GT AT T C G 
TAG AAAT G C T GAT 

SEQ ID NO. 6012 

STRAIN 2603 frame: 1

MKEKQSKRLI YILLWS 1 1 FI SVFTYS I SQPSKLLPPKELVILS PNSQAILTGTI PAFEE 
KYGIKVKLIQGGTGQLIDRLSKEGKQLKADIFFGGNYTQFESHKALFESYVSKNVHTVIP
DYIHPSDTATPYTINGSVLIVNNELAKGLTIKSYEDLLQPSLKGKIAFADPNTSSSAFSQ 
LTNILLAKGGYTNPKAWNYVKKLQHNINAIKSSSSSEVYQSVAEGKMIVGLTYEDPSVNL 
QKSGANVSIVYPTEGTVFVPSSVAIIKNAPSMKEAKLFINFMLSLDVQNAFGQSTSNRPI 
RKDAQTSNGMKALKDIATLKEDYRYVTKHKGQILKTYNRIRRNAD

SEQ ID NO. 6013 

STRAIN 090 frame: 1 

QPSKLLPPKELVILSPNSQAILTGTIPAFEEKYGIKVKLIQGGTGQLIDRLSKEGKQLKA
DIFFGGNYTQFESHKALFESYVSKNVHTVIPDYIHPSDTATPYTINGSVLIVNNELAKGL 
TIKSYEDLLQPSLKGKIAFADPNTSSSAFSQLTNILLAKGGYTNPKAWNYVKKLQHNINA 
IKSSSSSEVYQSVAEGKMIVGLTYEDPSVNLQKSGANVSIVYPTEGTVFVPSSVAIIKNA
PSMKEAKLFINFMLSLDVQNAFGQSTSNRPIRKDAQTSNGMKALKDIATLKEDYRYVTKH 
KGQILKTYNRIRRNAD

SEQ ID NO. 6014 

STRAIN A909 frame: 1 

QPSKLLPPKELVILSPNSQAILTGTIPAFEEKYGIKVKLIQGGTGQLIDRLSKEGKQLKA 
DIFFGGNYTQFESHKALFESYVSKNIHTVIPDYIHPSDTATPYTINGSVLIVNNELAKGL 
TIKSYEDLLQPSLKGKIAFADPNTSSSAFSQLTNILLAKGGYTNPKAWNYVKKLQHNINA 
IKSSSSSEVYQSVAEGKMIVGLTYEDPSVNLQKSGANVSIVYPTEGTVFVPSSVAIIKNA
PSMKEAKLFINFMLSLDVQNAFGQSTSNRPIRKDAQTSNGMKALKDIATLKEDYRYVTKH 
KGQILKTYNRIRRNAD

SEQ ID NO. 6015 

STRAIN H36B frame: 2

KLLPPKELVILSPNSQAILTGTIPAFEEKYGIKVKLIQGGTGQLIDRLSKEGKQLKADIF
FGGNYTQFESHKALFESYVSKNIHTVIPDYIHPSDTATPYTINGSVLIVNNELVKGLTIK 
SYEDLLQPSLKGKIAFADPNTSSSAFSQLTNILLAKGGYTNPKAWNYVKKLQHNINAIKS 
SSSSEVYQSVAEGKMIVGLTYEDPSVNLQKSGANVSIVYPTEGTVFVPSSVAIIKNAPSM
KEAKLFINFMLSLDVQNAFGQSTSNRPIRKDAQTSNGMKALKDIATLKEDYRYVTKHKGQ 
ILKTYNRIRRNAD 

SEQ ID NO. 6016 

STRAIN 18RS21 frame: 1 

QPSKLLPPKELVILSPNSQAILTGTIPAFEEKYGIKVKLIQGGTGQLIDRLSKEGKQLKA
DIFFGGNYTQFESHKALFESYVSKNVHTVIPDYIHPSDTATPYTINGSVLIVNNELAKGL 
TIKSYEDLLQPSLKGKIAFADPNTSSSAFSQLTNILLAKGGYTNPKAWNYVKKLQHNINA 
IKSSSSSEVYQSVAEGKMIVGLTYEDPSVNLQKSGANVSIVYPTEGTVFVPSSVAIIKNA
PSMKEAKLFINFMLSLDVQNAFGQSTSNRPIRKDAQTSNGMKALKDIATLKEDYRYVTKH
KGQILKTYNRIRRNAD

SEQ ID NO. 6017 

STRAIN M732 frame: 1 

QPSKLLPPKELVILSPNSQAILTGTIPAFEEKYGIKVKLIQGGTGQLIDRLSKEGKQLKA
DIFFGGNYTQFESHKALFESYVSKNVHTVIPDYIHPSDTATPYTINGSVLIVNNELAKGL 
TIKSYEDLLQPSLKGKIAFADPNTSSSAFSQLTNILLAKGGYTNPKAWNYVKKLQHNINA 
IKSSSSSEVYQSVAEGKMIVGLTYEDPSVNLQKSGANVSIVYPTEGTVFVPSSVAIIKNA
PSMKEAKLFINFMLSLDVQNAFGQSTSNRPIRKDAQTSNGMKALKDIATLKEDYRYVTKH
KSQILKTYNRIRRNAD

SEQ ID NO. 6018 

STRAIN COH1 frame: 1 

QPSKLLPPKELVILSPNSQAILTGTIPAFEEKYGIKVKLIQGGTGQLIDRLSKEGKQLKA
DIFFGGNYTQFESHKALFESYVSKNVHTVIPDYIHPSDTATPYTINGSVLIVNNELAKGL 
TIKSYEDLLQPSLKGKIAFADPNTSSSAFSQLTNILLAKGGYTNPKAWNYVKKLQHNINA 
IKSSSSSEVYQSVAEGKMIVGLTYEDPSVNLQKSGANVSIVYPTEGTVFVPSSVAIIKNA
PSMKEAKLFINFMLSLDVQNAFGQSTSNRPIRKDAQTSNGMKALKDIATLKEDYRYVTKH
KSQILKTYNRIRRNAD

SEQ ID NO. 6019 

STRAIN M781 frame: 1

QPSKLLPPKELVILSPNSQAILTGTIPAFEEKYGIKVKLIQGGTGQLIDRLSKEGKQLKA
DIFFGGNYTQFESHKALFESYVSKNVHTVIPDYIHPSDTATPYTINGSVLIVNNELAKGL 
TIKSYEDLLQPSLKGKIAFADPNTSSSAFSQLTNILLAKGGYTNPKAWNYVKKLQHNINA 
IKSSSSSEVYQSVAEGKMIVGLTYEDPSVNLQKSGANVSIVYPTEGTVFVPSSVAIIKNA
PSMKEAKLFINFMLSLDVQNAFGQSTSNRPIRKDAQTSNGMKALKDIATLKEDYRYVTKH 
KSQILKTYNRIRRNAD 

SEQ ID NO. 6020 

STRAIN CJB110 frame: 1 

QPFKLLPPKELVILSPNSQAILTGTIPAFEEKYGIKVKLIQGGTGQLIDRLSKEGKQLKA 

DIFFGGNYTQFESHKALFESYVSKNVHTVIPDYIHPSDTATPYTINGSVLIVNNELAKGL 

TIKSYEDLLQPSLKGKIAFADPNTSSSAFSQLTNILLAKGGYTNPKAWNYVKKLQHNINA

IKSSSSSEVYQSVAEGKMIVGLTYEDPSVNLQKSGANVSIVYPTEGTVFVPSSVAIIKNA 

PSMKEAKLFINFMLSLDVQNAFGQSTSNRPIRKDAQTSNGMKALKDIATLKEDYRYVTKH 

KGQILKTYNRIRRNAD 

SEQ ID NO. 6021 

STRAIN 1169NT frame: 3 

SQAILTGTIPAFEEKYGIKVKLIQGGTGQLIDRLSKEGKHLKADIFFGGNYTQFESHKAL 
FESYVSKNVHTVIPDYIHPSDTATPYTINGSVLIVNNELAKGLTIKSYEDLLQPSLKGKI 
AFADPNTSSSAFSQLTNILLAKGGYTNPKAWNYVKKLQHNINAIKSSSSSEVYQSVAEGK 
MIVGLTYEDPSVNLQKSGANVSIVYPTEGTVFVPSSVAIIKNAPSMKEAKLFINFMLSLD 
VQNAFGQSTSNRPIRKDAQTSNGMKALKDIATLKEDYRYVTKHKGQILKTYNRIRRNAD

SEQ ID NO. 6022 

STRAIN JM9130013 frame: 1

QPSKLLPPKELVILSPNSQAILTGTIPAFEEKYGIKVKLIQGGTGQLIDRLSKEGKQLKA 
DVFFGGNYTQFESHKALFESYVSKNVHTVIPDYIHPSDTATPYTINGSVLIVNNELAKGL 
TIKSYEDLLQPSLKGKIAFADPNTSSSAFSQLTNILLAKGGYTNPKAWNYVKKLQHNINA 
IKSSSSSEVYQSVAEGKMIVGLTYEDPSVNLQKSGANVSIVYPTEGTVFVPSSVAIIKNA
PSMKEAKLFINFMLSLDVQNAFGQSTSNRPIRKDAQTSNGMKALKDIATLKEDYRYVTKH 
KGQILKTYNRIRRNAD 

SEQ ID NO. 6101 
STRAIN 2603 

ATGGTAAAAGTTAGTGTAAGTTCTGTAGGAACTCAAGCATCAACAGTAGCTATTTCTATG 
T T T AGT CG T GT AT C GG C T T T AAAT GAT G C AAT AAC AAAAC T AT CAT CT T T T G C AG AGG CT 
GCAACTCTTCAAGGGACTGCTTATTCAAATGCAAAAAGCTATGCTACTGGAACGTTAACT 
CCGATGCTTCAAGGAATGATTCTTTTCTCTGAAACATTGAGTGAGAAATGTACAGAATTA 
CAAACCTTATATGTCTCAATTTGTGGTGATGAGGATTTAGACTCTGTCGTTTTAGAATCA
AAAT T AGC AAG T G AT AGGG C AT CAT T AAAG AT T G C T G AAG C ACT T T TAG AG CAT C T T AAC 
GAT GAT C C AG AAC C T T C C AAAT CT G C C AT AAGT T C T AC AAAAAGT AAT AT T AAAAAAT T A 
AAAAAACGTATAAAATCTAATCAAAAGAAATTAGACAACCTTAATGAATTTAACGCCCAT
T C AG C AAC AGT AT T T G C GG AC AT T T CT AAT G C AC AGT C AAC T GT T AAC C AAG C ACT AG C G 
G CT GT T T C AAC AGG AT T T T CT G GAT AT AAT AGT AAAAC C GGAG CT T T T G G AAAAC C AAC A 
T C C G GAC AG AT GG AAT G G AC AAAG AC AG T T AAG AAG AAT T G G AAAG AG C GAG AAG AC G C C 
AAAG C T G AAGAACT G AAAAG T AAAAAG G C T G AAG AAAG T AAG AAAG CT T C AAAAAT T G AA 
AAT AC T AC T AAAAAAAG T AAT GT T T C AGT T GAT AAAAAG AAAT T AAT AAAAG C G G C T AAT 
G AAG C GT AT AAAT TAG G AGAAAT T AAAAAAG AT AC C T AT G AAT C AAT TAT C AGT G G T T T A 
AGT AATGCATCGGCTGCCTTACTTAAAGAGGTAGCT AAAT CAAAATTGACT GAC ACAGCT 
CGGCTATTGATG 

SEQ ID NO. 6102 

STRAIN 090

T T AAAT GAT GC AAT AAC AAAAC TAT CAT C T T T T G C AG AGG C T 
GC AACT CT TCAAGGGACTGCTT ATT C AAAT GC AAAAAG CTATGCT AC TGG 
AACGTTAACTCCGATGCTTCAAGGAATGATTCTTTTCTCTGAAACATTGA 
G T G AG AAAT GT AC AG AAT T AC AAAC C T TAT AT GT C T C AAT T T GT G G T GAT 
G AGG AT T TAG AC TCTGTCGTT T TAG AAT C AAAAT T AG C AAGT GAT AGG G C 
AT C AT T AAAGAT T G C T G AAG C AC T T T TAG AG CAT C T T AAC GAT GAT C C AG 
AACCTT C C AAAT CTGCC AT AAGTTCT AC AAAAAGT AAT ATT AAAAAAT T A 
AAAAAAC G TAT AAAAT C T AAT C AAAAG AAAT TAG AC AAC C T T AAT G AAT T 
T AACG C C CAT T C AG C AAC AGT AT T T G C G GAC AT T T C T AAT G C AC AGT C AA 
CT GT T AAC C AAG C AC TAG CGGCTGTTT C AAC AG GAT T T T C T GG AT AT AAT 
AGT AAAAC CG GAG C T T T T G G AAAACC AAC AT C C GG AC AG AT GG AAT G G AC
AAAGAC AGT T AAG AAGAAT T GGAAAGAG CG AG AAGAC G C C AAAG CT G AAG 
AAC T GAAAAGT AAAAAGG CT G AAG AAAGT AAG AAAG C T T C AAAAAT T GAA 
AATACTACTAAAAAAAGTAATGTTTCAGTTGATAAAAAGAAATTAATAAA 
AGCGGCTAATGAAGCGTATAAATTAGGAGAAATTAAAAAAGATACCTATG 
AATCAATTATCAGTGGTTTAAGTAATGCATCGGCTGCCTTACTTAAAGAG 
GT AG C T AAAT C AAAAT T G ACT GAC AC AG CT CG G C TAT T GAT G 

SEQ ID NO. 6103 

STRAIN 18RS21 

T T AAAT GAT G C AAT AAC AAAAC TAT CAT C T T T T G C AG AG G C 
TGCAACTCTTCAAGGGACTGCTTATTCAAATGCAAAAAGCTATGCTACTG 
GAACGTTAACTCCGATGCTTCAAGGAATGATTCTTTTCTCTGAAACATTG 
AG T GAG AAAT GT AC AG AAT T AC AAAC C T T AT AT GT C T C AAT T T GT G GT G A 
T G AGG AT T T AGAC TCTGTCGTTT TAG AAT C AAAAT TAG C AAGT GAT AG GG 
CAT CAT T AAAGAT T G C T G AAG C AC T T T TAG AG CAT C T T AAC GAT GAT C C A 
G AAC C T T C C AAAT CT G C CAT AAGT T CT AC AAAAAGT AAT AT T AAAAAAT T 
AAAAAAAC G TAT AAAAT C T AAT C AAAAG AAAT TAG AC AAC C TT AAT G AAT 
T T AAC GC C CAT T C AG C AAC AGT AT T T G CGG AC AT T T CT AAT G C AC AGT C A 
AC T GT T AAC C AAGC AC T AGCGG C T GT T T C AAC AGGAT T T T C T G G AT AT AA 
TAGTAAAACCGGAGCTTTTGGAAAACCAACATCCGGACAGATGGAATGGA 
C AAAG AC AG T T AAG AAGAAT T G G AAAG AG C GAG AAG AC G C C AAAGC T GAA 
G AACT GAAAAGT AAAAAGGCTG AAG AAAGT AAG AAAGCT T C AAAAAT T GA 
AAAT ACT ACT AAAAAAAGTAATGT TT C AGT TGAT AAAAAGAAAT T AAT AA 
AAG C G G C T AAT GAAG C GT AT AAAT T AGG AG AAAT T AAAAAAG AT AC C T AT 
G AAT C AAT TAT C AGT G GT T T AAG T AAT G CAT CGGCTGCC T T AC T T AAAGA 
G G TAG CT AAAT C AAAAT T GAC T G AC AC AGC T C GG C T AT T GAT G 

SEQ ID NO. 6104 

STRAIN 2603 frame: 1

MVKVSVSSVGTQASTVAISMFSRVSALNDAITKLSSFAEAATLQGTAYSNAKSYATGTLT

PMLQGMILFSETLSEKCTELQTLYVSICGDEDLDSVVLESKLASDRASLKIAEALLEHLN 

DDPEPSKSAISSTKSNIKKLKKRIKSNQKKLDNLNEFNAHSATVFADISNAQSTVNQALA 

AVSTGFSGYNSKTGAFGKPTSGQMEWTKTVKKNWKEREDAKAEELKSKKAEESKKASKIE

NTTKKSNVSVDKKKLIKAANEAYKLGEIKKDTYESIISGLSNASAALLKEVAKSKLTDTA 
RLLM 

SEQ ID NO. 6105 

STRAIN 090 frame: 1 

LNDAITKLSSFAEAATLQGTAYSNAKSYATGTLTPMLQGMILFSETLSEKCTELQTLYVS
ICGDEDLDSVVLESKLASDRASLKIAEALLEHLNDDPEPSKSAISSTKSNIKKLKKRIKS
NQKKLDNLNEFNAHSATVFADISNAQSTVNQALAAVSTGFSGYNSKTGAFGKPTSGQMEW 
TKTVKKNWKEREDAKAEELKSKKAEESKKASKIENTTKKSNVSVDKKKLIKAANEAYKLG 
EIKKDTYESIISGLSNASAALLKEVAKSKLTDTARLLM 

SEQ ID NO. 6106 

STRAIN 18RS21 frame: 1 

LNDAITKLSSFAEAATLQGTAYSNAKSYATGTLTPMLQGMILFSETLSEKCTELQTLYVS

ICGDEDLDSVVLESKLASDRASLKIAEALLEHLNDDPEPSKSAISSTKSNIKKLKKRIKS 

NQKKLDNLNEFNAHSATVFADISNAQSTVNQALAAVSTGFSGYNSKTGAFGKPTSGQMEW 

TKTVKKNWKEREDAKAEELKSKKAEESKKASKIENTTKKSNVSVDKKKLIKAANEAYKLG
EIKKDTYESIISGLSNASAALLKEVAKSKLTDTARLLM

SEQ ID NO. 6201 
STRAIN 2603 

ATGATTTTAAAAATTTGTCGTGCAGCATATAGTTTACAATGGGGAGGTGTTTACCAATTA 
GCTTTGCT GG ATT AT C C T C G AAT T AAGG C GT T T G AAT T G G AAAG GAT AGG AG C T T T CAT A 
G C T T AC GAG AAAC AAT AT AAAAG AAAAAC T G AGAT AC AAT G T GAC GAT AAAC AT C T C C T C 
G C AAAAAT T GT T C AT TT T TT AAAAT AC AAT AGT TT TACTTTTCCCT AT ATT CCC AAAT AT 
AGAGAAGCGGCAGCTACTTTTAATGAGGATGGTATTAGTTTAACTTCTGATTTTTTAAGC 
CATACATGTACGATTGAAACTGCAAAACTAATTTTTAAAGAAGGTAAAATCTTATCAGCA 
GTTAAAGCCTTTAATAAGCCTGCTGAAGTACTGGTAAAAGATAAGAGGAATGCTGCTGGA 
GACCCTAAAGATTACTTTGACTATGTGATGTTGAACTGGTCAAATACCAATTCTGGTTAT 
CGTTTAGTAATGGAAAGATTGTTAGGCAAAGCACCATCTGAACAGGAGTTAACAGTAGGT 
TTTAAGCCAGGGGTCAGTTTTCATTTTACTTATCAAGATATCATCAATCATCCTGATTCT
AT T T T T GAT G GT TAT CAT C CT G C T AAAAT T AAAAAT C AG CT T T C T T TAG C AGAAC AT T T A 
GTTGCATGTGTTATCCCAAAACATTATCAAGAAGATTATCAAAGCCTTGTGCCCAATGAC 
TTGAAACACAGGGTTTATTATTTAGATTACTGTAACGAAACACTTTATGAGTGGAATCAA 
AAAGTTTATGATTTTCTTTGTCATTTGGAAAATAAA 

SEQ ID NO. 6202
STRAIN 090 

TGGATTATCCTCTAATTAAGGCGTTTGAATTGGAAAGGATAGGAGCTTTC 
AT AG C T T AC GAG AAAC AAT AT AAAAGAAAAAT T G AG AT AC AAT GT GACG A 
TAAACATCTCCTCACAAAAATTGTTCATTTTTTAAAATACAATAGTTTTA 
C T T T T C C CT AT AT T C C C AAAT AT AG AG AAG C GG C AG CT ACT T T T AAT GAG 
GAT GGT AT T AGT T T AAC T T C T GAT T T T T T AAG C CAT AC AT G T AC GAT T GA 
AACT GC AAAAC T AAT T T T TAAAG AAG GT AAAAT CT TAT C AG C AGTT AAAG 
C CT T T AAT AAGC C T GC T GAAGT AC T GGT AAAT G AT AAGAG G AAT G C T GC T 
GGAGACC CT AAAG AT T ACT T TG ACT ATGTGATGTTG AACT GGT C AAAT AC 
C AAT T CT GGT TAT C GT T TAG T AAT G G AAAG AT T GT TAG G C AAAG C AC CAT 
C T G AAC AGG AGT T AAC AGT AG CT T T T AAGC C AG G G GT C AG C T T T C ATT T T 
AAT T a T C AAGAT AT CAT C AAT CAT C CT G AT T C T AT T T T T GAT GGTT AT C A 
TCCTGCTAAAATT AAAAAT CAACTTTCTTTAGCAGAACATTT AGTT G CAT 
GT G T T AT C C C AAAAC AT TAT C AAGAAG AT T AT C AAAG CCTTGTGC C T AAT 
G AC T T G AAACAC AG AGT T T AT T AT T TAG AT TACT GT AAC G AAAC AC T T T A 
T G AGT G G AAT C AAAAAGT T TAT GAT TTTCTTTGT CAT T T GGAAAAT AAA 

SEQ ID NO. 6203 
STRAIN A909

TTGCTGGATTATCCTCGAATTAAGGCGTTTGAATTGGAAAGGATA 

GG AG C T T T CAT AG C T T AC GAGAAAC AAT AT AAAAGAAAAAT T GAG AT AC A 

AT GT G AC GAT AAAC AT C T C C T C AC AAAAAT T G T T CAT T T T T T AAAAT AC A 

ATAGTTTTACTTTTCCCTATATTCCCAAATATAGAGAAGCGGCAGCTACT 

TTTAATGAGGATGGTATTAGTTTAACTTCTGATTTTTTAAGCCATACATG 

T ACG AT T G AAAC T G C AAAACT AAT T T T TAAAG AAGGT AAAAT C T T AT C AG 

C AGT TAAAG C CT T T AAT AAG C C T G C T GAAGT ACT GGT AAAT GAT AAG AG G 

AATGCTGCTGG AG ACCCT AAAG AT TACTT TG ACT ATGTGATGTTG AACT G 

GTCAAATACCAATTCTGGTTATCGTTTAGTAATGGAAAGATTGTTAGGCA 

AAG C AC CAT CT G AAC AG G AGT T AAC AGT AG C T T T T AAG C C AGG G GT C AG C 

TTTCATTTTAATTATCAAGATATCATCAATCATCCTGATTCTATTTTTGA 

T GG T T AT CAT C C T G CT AAAAT T AAAAAT C AAC T T T CT T TAG C AG AAC AT T 

TAGTTGCATGTGTTATCCCAAAACATTATCAAGAAGATTATCAAAGCCTT 

G T G C CT AAT GAC T T G AAAC AC AG AGT T T AT TAT T TAG AT TACT G T AAC G A 

AACACTTTATGAGTGGAATCAAAAAGTTTATGATTTTCTTTGTCATTTGG 

AAAATAAA 

SEQ ID NO. 6204 
STRAIN H36B 

TTAAGGCGTTTGAATTGGAAAGGATAGGAGCTTTCATAGCTTACGAGAAA 
C AAT AT AAAAGAAAAAT T GAG AT AC AAT GT GACGAT AAAC AT CT CCT C AC 
AAAAATTGTTCATTTTTTAAAATACAATAGTTTTACTTTTCCCTATATTC 
C C AAAT AT AG AG AAG CG G C AG C T AC T T T T AAT GAG GAT G GT AT TAG T T T A 
ACT T C T GAT T T T T T AAG C CAT AC AT GT AC GAT T G AAACT GC AAAAC T AAT 
T T T TAAAG AAG GT AAAAT C T TAT C AG C AGT TAAAG C C T T T AAT AAG C CT G 
CTGAAGTACTGGTAAATGATAAGAGGAATGCTGCTGGAGACCCTAAAGAT 
T ACTT TGACT AT GT GAT GTTGAACT GGT C AAAT ACC AAT TCT GGTT ATCG 
T T T AG T AAT GG AAAG AT T GT T AG GC AAAGC AC CAT C T G AAC AG GAG T T AA 
CAGTAGCTTTTAAGCCAGGGGTCAGCTTTCATTTTAATTATCAAGATATC 
ATCAATCATCCTGATTCTATTTTTGATGGTTATCATCCTGCTAAAATTAA 
AAATCAACTTTCTTTAGCAGAACATTTAGTTGCATGTGTTATCCCAAAAC 
ATT AT CAAGAAGATT AT C AAAGC CT TGT GC CT AAT GACT TGAAAC ACAGA 
GTT T ATT AT T T AGATT ACT GT AACGAAAC ACT TT AT GAGTGG AAT CAAAA 
AGTTTATGATTTTCTTTGTCATTTGGAAAATAAA 

SEQ ID NO. 6205 
STRAIN 18RS21 

TTGCTGGATTATCCTCGAATTAAGGCGTT 
TGAATTGGAAAGGATAGGAGCTTTCATAGCTTACGAGAAACAATATAAAA
G AAAAACT G AG AT AC AAT GT G AC G AT AAAC AT CT C C T C G C AAAAAT T GT T 
CAT T T T T T AAAAT AC AAT AGT T T T AC TT T T C C CT AT AT T C CC AAAT AT AG 
AGAAGCGGCAGCTACTTTTAATGAGGATGGTATTAGTTTAACTTCTGATT 
TTTTAAGCCATACATGTACGATTGAAACTGCAAAACTAATTTTTAAAGAA 
GGTAAAATCTTATCAGCAGTTAAAGCCTTTAATAAGCCTGCTGAAGTACT 
GGT AAAAG AT AAGAGG AAT GC T G CT G GAG AC C C T AAAGAT T ACT T T GAC T 
ATGTGATGTTGAACTGGTCAAATACCAATTCTGGTTATCGTTTAGTAATG 
GAAAGATTGTTAGGCAAAGCACCATCTGAACAGGAGTTAACAGTAGGTTT 
T AAGCCAGGGGTCAGTTTTCATTT TACT TAT CAAGAT AT CATC AAT CATC 
CTGATTCTATTTTTGATGGTTATCATCCTGCTAAAATTAAAAATCAGCTT 
T C T T TAG C AG AAC AT T T AG T T G CAT G T G T TAT C C C AAAAC AT TAT C AAG A 
AG AT TAT C AAAG CCTTGTGC C C AAT GAC T T G AAAC AC AGGGT T TAT TAT T 
TAG AT T AC T GT AAC G AAAC ACT T TAT GAG T GG AAT C AAAAAGT T T AT GAT 
TTTCTTTGTCATTTGGAAAATAAA 

SEQ ID NO. 6206 
STRAIN M732 

TTGCTGGATTATCCTCGAATTAAGGCGTT 

T G AAT T GG AAAGGAT AGG AGC T T T C AT AG CT TAG GAG AAAC AAT AT AAAA 
G AAAAAC T GAG AT AC AAT GT GAC GAT AAACAT CT C C T C G C AAAAAT T GT T 
CAT T T T T T AAAAT AC AAT AGT T TT ACT T TT C CCT AT AT T C CC AAAT AT AG 
AG AAG CGG C AG C T AC T T T T AAT GAG GAT GGT AT T AGT T T AAC T T CT G AT T 
TTTTAAGCCATACATGTACGATTGAAACTGCAAAACTAATTTTTAAAGAA 
GGT AAAAT C T TAT C AG C AGT T AAAG C C T T T AAT AAG C CT G C T G AAGT AC T 
G GT AAAAG AT AAG AGG AAT G C T G C T GG AGAC CC T AAAG AT T AC T T T GAC T 
ATGTGATGTTGAACTGGTCAAATACCAATTCTGGTTATCGTTTAGTAATG 
GAAAGAT T GT T AG GC AAAG C AC CAT C T G AAC AG GAG T T AAC AGT AGGT T T 
T AAG C C AGGGGT C AG T T T T CAT T T TAG T TAT CAAGAT AT CAT C AAT CAT C 
C T GAT T CT AT T T T T GAT GGT TAT CAT C C T G C T AAAAT T AAAAAT C AG C T T 
T CT T TAG C AG AAC AT T T AGT T G CAT GT G T TAT C C C AAAAC AT TAT C AAG A 
AG AT TAT C AAAG C CT T GT G CC C AAT G AC TT G AAAC AC AG G GT T TAT TAT T 
TAG AT TACT GT AAC G AAAC AC T T TAT G AGT G G AAT C AAAAAGT T T AT GAT 
TTTCTTTGnCATTTGGAAAATAAA 

SEQ ID NO. 6207 
STRAIN COH1 
TTGCTGGAT 

TATCCTCGAATTAAGGCGTTTGAATTGGAAAGGATAGGAGCTTTCATAGC 
TT AC GAGAAAC AAT AT AAAAG AAAAACT GAGAT AC AAT GT GACGAT AAAC 
ATCTCCTCGCAAAAATTGTTCATTTTTTAAAATACAATAGTTTTACTTTT 
CCCTATATTCCCAAATATAGAGAAGCGGCAGCTACTTTTAATGAGGATGG 
TATTAGTTTAACTTCTGATTTTTTAAGCCATACATGTACGATTGAAACTG 
CAAAACTAATTTTTAAAGAAGGT AAAAT CT TAT CAGCAGTTAAAGCCTTT 
AATAAGCCTGCTGAAGTACTGGTAAAAGATAAGAGGAATGCTGCTGGAGA 
C C C T AAAG AT T AC T T T GAC TAT GT GAT GT T G AAC T GGT C AAAT AC C AAT T 
C T GG T TAT C GT T T AGT AAT GGAAAGAT T GT TAG G C AAAG C AC CAT C T G AA 
CAGGAGTT AACAGT AGGT TTTAAGCC AGGGGT CAGTTTTCATTTT ACT T A 
TCAAGATATCATCAATCATCCTGATTCTATTTTTGATGGTTATCATCCTG 
CT AAAAT T AAAAAT C AG C T T T C T T TAG C AG AAC AT T T AGT T G C AT GT GT T 
ATCCCAAAACATTATCAAGAAGATTATCAAAGCCTTGTGCCCAATGACTT 
G AAAC AC AGGGT T TAT TAT T TAG AT TACT G T AAC G AAAC AC T T TAT G AGT 
GGAATCAAAAAGTTTATGATTTTCTTTGGCATTTGGAAAATAAA 

SEQ ID NO. 6208 
STRAIN M781

TTGCTGGA 

T T AT C C T C G AAT T AAG G CGT T T G AAT T G G AAAG GAT AG G AG CT T T CAT AG 
CT T ACGAG AAAC AAT AT AAAAG AAAAACT GAGAT AC AAT GT GACGAT AAA 
CATCTCCTCGCAAAAATTGTTCATTTTTTAAAATACAATAGTTTTACTTT 
T C C C TAT AT T C C C AAAT AT AG AG AAG CGG C AG CT AC T T T T AAT GAG GAT G 
G TAT TAG T T T AACT T CT GAT T T T T T AAG C CAT AC AT G T AC G AT T G AAAC T 
G C AAAAC T AAT T T T T AAAG AAGGT AAAAT C T T AT C AG C AGT T AAAG C C T T 
T AAT AAG C CT G CT G AAGT ACT G GT AAAAG AT AAG AGG AAT G CT G CT GG AG 
AC C CT AAAGAT T ACT T T G AC T AT GT G AT G T T G AACT G G T C AAAT AC C AAT
TCTGGTTATCGTTTAGTAATGGAAAGATTGTTAGGCAAAGCACCATCTGA 
ACAGGAGTTAACAGTAGGTTTTAAGCCAGGGGTCAGTTTTCATTTTACTT 
AT C AAG AT AT CAT C AAT CAT C C T GAT T C TAT T T T T G AT GGT T AT CAT C CT 
G CT AAAAT T AAAAAT C AGC T T T C T T TAG C AG AAC AT TT AGT T G C AT GT GT 
TAT C C C AAAAC AT T AT C AAGAAG AT TAT C AAAG CCTTGTGCC C AAT G ACT 
TGAAACACAGGGTTTATTATTTAGATTACTGTAACGAAACACTTTATGAG 
T GG AAT C AAAAAGT T TAT GAT TTTCTTTGT CAT T T GG AAAAT AAA 

SEQ ID NO. 6209 
STRAIN CJB110 

TTGCTGGATTATCCTCGAATTAAGGC

GTTTGAATTGGAAAGGATAGGAGCTTTCATAGCTTACGAGAAACAATATA 
AAAG AAAAAT T G AGAT AC AAT GT G AC G AT AAAC AT CT C C T C AC AAAAAT T 
GTT CAT TTTTT AAAAT ACAATAGTTTTACTTTTCCCT AT ATT CCCAAATA 
TAGAGAAGCGGCAGCTACTTTTAATGAGGATGGTATTAGTTTAACTTCTG 
AT T T T T T AAG C CAT AC AT G T AC GAT T G AAACT G C AAAAC T AAT T T T T AAA 
GAAGGTAAAATCTTATCAGCAGTTAAAGCCTTTAATAAGCCTGCTGAAGT 
ACTGGTAAATGATAAGAGGAATGCTGCTGGAGACCCTAAAGATTACTTTG 
ACTATGTGATGTTGAACTGGTCAAATACCAATTCTGGTTATCGTTTAGTA 
AT GG AAAG AT T GT T AGGC AAAG C AC CAT C T G AAC AGG AGT T AAC AGT AG C 
TTTTAAGCCAGGGGTCAGCTTTCATTTTAATTATCAAGATATCATCAATC 
ATCCTGATTCTATTTTTGATGGTTATCATCCTGCTAAAATTAAAAATCAA 
C T T T C T T T AG C AGAAC AT T T AGT T G C AT GT GT T AT C C C AAAAC AT TAT C A 
AG AAG AT TAT C AAAG CCTTGTGC C T AAT G ACT T G AAAC AC AG AGT T TAT T 
AT T TAG AT T ACT GT AAC G AAAC AC T T TAT GAG T G G AAT C AAAAAGT T TAT 
GATTTTCTTTGTCATTTGGAAAATAAA 

SEQ ID NO. 6210 
STRAIN 1169NT 

AATTAAGGCGTTTGAATTGGAAAGGATAGGAGCTTTCATAGCTTACGAGA 
AAC AAT AT AAAAG AAAAACT G AGAT AC AAT GT G AC GAT AAAC AT CT C C T C 
GCAAAAATTGTTCATTTTTTAAAATACAATAGTTTTACTTTTCCCTATAT 
T C CC AAAT AT AG AG AAG CG GC AG C T AC T T T T AAT GAG GAT G GT AT T AGT T 
T AAC T T C T GAT T T T T T AAG C CAT AC AT GT AC GAT T G AAAC T G C AAAAC T A 
AT T T T T AAAG AAGGT AAAAT C T TAT C AG C AGT T AAAG C C T T T AAT AAGC C 
T G C T GAAGT AC T GGT AAAT GAT AAG AGG AAT G CT G C T G GAG AC C CT AAAG 
AT TAG T T T G AC T AT GT G AT GTT GAACT GGT C AAAT AC C AAT T C T GGT TAT 
CGTTTAGTAATGGAAAGATTGTTAGGCAAAGCACCATCTGAACAGGAGTT 
AACAGTAGGTTTTAAGCCAGGGGTCAGCTTTCATTTTACTTATCAAGATA 
TCATCAATCATCCTGATTCTATTTTTGATGGTTATCATCCTGCTAAAATT 
AAAAATCAGCTTTCTTTAGCAGAACATTTAGTTGCGTGTGTTATCCCAAA 
AC AT TAT C AAG AAG AT TAT C AAAAT C T T GT G C C C AAT G AC T T G AAAC AC A 
G AGT T TAT TAT T T AGAT T AC T GT AAC G AAAC AC T T T AT G AGT GG AAT C AA 
AAAGTTTATGATTTTCTTTGTCATTTGGAAAATAAA 

SEQ ID NO. 6211 
STRAIN JM9130013 

ATAGGAGCTTTCATAGCTTACGAGAAACAATATAAAAGAAAAATTGAGAT 
ACAATGTGACGATAAACATCTCCTCACAAAAATTGTTCATTTTTTAAAAT 
AC AAT AG T T T T ACT T T T C C C TAT AT T C C C AAAT AT AG AG AAG C G G C AG C T 
AC T T T T AAT GAG G AT GG TAT T AGT T T AAC T T C T GAT T T T T T AAGC CAT AC 
AT GT AC GAT T G AAAC T G C AAAAC T AAT T T T T AAAG AAG G T AAAAT C T TAT 
C AG C AGT T AAAGC CT T T AAT AAG C C T G C T GAAGT AC T G GT AAAT GAT AAG 
AGGAATGCTGCTGGAGACCCTAAAGATTACTTTGACTATGTGATGTTGAA 
CTGGTCAAATACCAATTCTGGTTATCGTTTAGTAATGGAAAGATTGTTAG 
G C AAAG C AC CAT C T G a AC AG GAG T T AAC AG T AG CT T T T AAG C C AG GGGT C 
AG CT T T CAT T T T AAT TAT C AAG AT AT CAT C AAT CAT C C T GAT T C TAT T T T 
T GAT G G T T AT CAT C CT G CT AAAAT T AAAAAT C AAC T T T C T T T AG C AG AAC 
ATT TAGTTGCAT GTGTT AT CCCAAAACATT AT CAAGAAG AT T AT C AAAGC 
C T T GT G C C T AAT G AC T T G AAAC AC AG AG T T T AT T AT T T AG AT T ACT GT AA 
CG AAAC ACT TT AT GAGTGG AAT C AAAAAGT TTATG ATT TTCTTTGTCATT 
T G G AAAAT AAA 

SEQ ID NO. 6212

STRAIN 2603 frame: 1

MILKICRAAYSLQWGGVYQLALLDYPRIKAFELERIGAFIAYEKQYKRKTEIQCDDKHLL 
AKIVHFLKYNSFTFPYIPKYREAAATFNEDGISLTSDFLSHTCTIETAKLIFKEGKILSA 
VKAFNKPAEVLVKDKRNAAGDPKDYFDYVMLNWSNTNSGYRLVMERLLGKAPSEQELTVG 
FKPGVSFHFTYQDIINHPDSIFDGYHPAKIKNQLSLAEHLVACVIPKHYQEDYQSLVPND 
LKHRVYYLDYCNETLYEWNQKVYDFLCHLENK 

SEQ ID NO. 6213 

STRAIN A909 frame: 1 

LLDYPRIKAFELERIGAFIAYEKQYKRKIEIQCDDKHLLTKIVHFLKYNSFTFPYIPKYR 
EAAATFNEDGISLTSDFLSHTCTIETAKLIFKEGKILSAVKAFNKPAEVLVNDKRNAAGD
PKDYFDYVMLNWSNTNSGYRLVMERLLGKAPSEQELTVAFKPGVSFHFNYQDIINHPDSI 
FDGYHPAKIKNQLSLAEHLVACVIPKHYQEDYQSLVPNDLKHRVYYLDYCNETLYEWNQK 
VYDFLCHLENK 

SEQ ID NO. 6214 

STRAIN H36B frame: 3

KAFELERIGAFIAYEKQYKRKIEIQCDDKHLLTKIVHFLKYNSFTFPYIPKYREAAATFN 
EDGISLTSDFLSHTCTIETAKLIFKEGKILSAVKAFNKPAEVLVNDKRNAAGDPKDYFDY 
VMLNWSNTNSGYRLVMERLLGKAPSEQELTVAFKPGVSFHFNYQDIINHPDSIFDGYHPA
KIKNQLSLAEHLVACVIPKHYQEDYQSLVPNDLKHRVYYLDYCNETLYEWNQKVYDFLCH 
LENK 

SEQ ID NO. 6215 

STRAIN 18RS21 frame: 1 

LLDYPRIKAFELERIGAFIAYEKQYKRKTEIQCDDKHLLAKIVHFLKYNSFTFPYIPKYR 
EAAATFNEDGISLTSDFLSHTCTIETAKLIFKEGKILSAVKAFNKPAEVLVKDKRNAAGD 
PKDYFDYVMLNWSNTNSGYRLVMERLLGKAPSEQELTVGFKPGVSFHFTYQDIINHPDSI
FDGYHPAKIKNQLSLAEHLVACVIPKHYQEDYQSLVPNDLKHRVYYLDYCNETLYEWNQK 
VYDFLCHLENK 

SEQ ID NO. 6216 

STRAIN M732 frame: 1

LLDYPRIKAFELERIGAFIAYEKQYKRKTEIQCDDKHLLAKIVHFLKYNSFTFPYIPKYR 
EAAATFNEDGISLTSDFLSHTCTIETAKLIFKEGKILSAVKAFNKPAEVLVKDKRNAAGD 
PKDYFDYVMLNWSNTNSGYRLVMERLLGKAPSEQELTVGFKPGVSFHFTYQDIINHPDSI
FDGYHPAKIKNQLSLAEHLVACVIPKHYQEDYQSLVPNDLKHRVYYLDYCNETLYEWNQK 
VYDFLXHLENK 

SEQ ID NO. 6217 

STRAIN COH1 frame: 1 

LLDYPRIKAFELERIGAFIAYEKQYKRKTEIQCDDKHLLAKIVHFLKYNSFTFPYIPKYR 
EAAATFNEDGISLTSDFLSHTCTIETAKLIFKEGKILSAVKAFNKPAEVLVKDKRNAAGD 
PKDYFDYVMLNWSNTNSGYRLVMERLLGKAPSEQELTVGFKPGVSFHFTYQDIINHPDSI
FDGYHPAKIKNQLSLAEHLVACVIPKHYQEDYQSLVPNDLKHRVYYLDYCNETLYEWNQK 
VYDFLWHLENK 

SEQ ID NO. 6218 

STRAIN M781 frame: 1 

LLDYPRIKAFELERIGAFIAYEKQYKRKTEIQCDDKHLLAKIVHFLKYNSFTFPYIPKYR 
EAAATFNEDGISLTSDFLSHTCTIETAKLIFKEGKILSAVKAFNKPAEVLVKDKRNAAGD 
PKDYFDYVMLNWSNTNSGYRLVMERLLGKAPSEQELTVGFKPGVSFHFTYQDIINHPDSI
FDGYHPAKIKNQLSLAEHLVACVIPKHYQEDYQSLVPNDLKHRVYYLDYCNETLYEWNQK 
VYDFLCHLENK 

SEQ ID NO. 6219 

STRAIN CJB110 frame: 1 

LLDYPRIKAFELERIGAFIAYEKQYKRKIEIQCDDKHLLTKIVHFLKYNSFTFPYIPKYR 
EAAATFNEDGISLTSDFLSHTCTIETAKLIFKEGKILSAVKAFNKPAEVLVNDKRNAAGD
PKDYFDYVMLNWSNTNSGYRLVMERLLGKAPSEQELTVAFKPGVSFHFNYQDIINHPDSI
FDGYHPAKIKNQLSLAEHLVACVIPKHYQEDYQSLVPNDLKHRVYYLDYCNETLYEWNQK 
VYDFLCHLENK

SEQ ID NO. 6220

STRAIN 1169NT frame: 2 

IKAFELERIGAFIAYEKQYKRKTEIQCDDKHLLAKIVHFLKYNSFTFPYIPKYREAAATF
NEDGISLTSDFLSHTCTIETAKLIFKEGKILSAVKAFNKPAEVLVNDKRNAAGDPKDYFD
YVMLNWSNTNSGYRLVMERLLGKAPSEQELTVGFKPGVSFHFTYQDIINHPDSIFDGYHP
AKIKNQLSLAEHLVACVIPKHYQEDYQNLVPNDLKHRVYYLDYCNETLYEWNQKVYDFLC
HLENK 

SEQ ID NO. 6221 

STRAIN JM9130013 frame: 1 

IGAFIAYEKQYKRKIEIQCDDKHLLTKIVHFLKYNSFTFPYIPKYREAAATFNEDGISLT
SDFLSHTCTIETAKLIFKEGKILSAVKAFNKPAEVLVNDKRNAAGDPKDYFDYVMLNWSN
TNSGYRLVMERLLGKAPSEQELTVAFKPGVSFHFNYQDIINHPDSIFDGYHPAKIKNQLS 
LAEHLVACVIPKHYQEDYQSLVPNDLKHRVYYLDYCNETLYEWNQKVYDFLCHLENK 

SEQ ID NO. 6222 

STRAIN 090 frame: 3 

DYPLIKAFELERIGAFIAYEKQYKRKIEIQCDDKHLLTKIVHFLKYNSFTFPYIPKYREA 
AATFNEDGISLTSDFLSHTCTIETAKLIFKEGKILSAVKAFNKPAEVLVNDKRNAAGDPK 
DYFDYVMLNWSNTNSGYRLVMERLLGKAPSEQELTVAFKPGVSFHFNYQDIINHPDSIFD
GYHPAKIKNQLSLAEHLVACVIPKHYQEDYQSLVPNDLKHRVYYLDYCNETLYEWNQKVY
DFLCHLENK 

SEQ ID NO. 6301 
STRAIN 2603 

AT G AAAAGT CG AAAAAAAG AT AAAT T G GT AT T GAG G T T AAC AAC AAC AC TAT TGGTTTTT 
GGTTTGGGTGGGGTTTGGTTTTATAATTATAAAAATGATAATGTCGAACCGACAGTCACT 
AGT G CAT CGG AT C AAAC G AC G AC T T T TAT T C AAAC GAT T T C T C C AAC AG C TAT T G AAAT T 
T C T AAG AC C TAT GAT T T GT AT G C G T C AG T CT T AT T AG C AC AAGC T AT T T T GG AAT CAT C C 
AGT G G AC AAT C AG AT T T GT CT AAG GC T C C T AAT TAT AAC C T CT T T G G C AT C AAAGG AG AA 
T AT AAAGGT AAAT CT GT CCAAAT GCCTACTTT AGAAGAT G ATGGGAAAGGCAAT AT GACT 
CAAATCCAAGCTCCTTTTCGCGCCTATCCAAATTATTCTGCTTCACTATATGATTATGCT 
G AGT T AGT AT C T AG T C AAAAGT AT G CAT C T G T T T G G AAAT C AAAT AC CT C T T CT TAT AAG 
GAT G C T ACT G C AG C T CT AAC AGGT CT T T AT G C G AC AG AT AC T G C T TAT G CT AGT AAAT T A 
AAC C AAAT TAT T G AAAC C T AC AG T CT AG AT G C T T AT GAT AAA 

SEQ ID NO. 6302 

STRAIN 090 

GGGGTTTGGTTTTATAATTATAA 

AAAT GAT AAT GT C GAAC C G AC AG T C ACT AGT G CAT C GG AT C AAAC G AC GA 
C T T T T AT T C AAAC GAT T T CT C C AAC AGC T AT T G AAAT T T CT AAGAC C T AT 
GAT T T GT AT G C GT C AG T CT T AT TAG C AC AAGCT AT T T T G G AAT CAT C C AG 
T G G AC AAT C AG AT T T GT CT AAG G CT C C T AAT TAT AAC C T CT T T GGC AT C A 
AAG GAGAAT AT AAAG G T AAAT C T GT C C AAAT GC C T AC T T T AGAAGAT GAT 
GGGAAAGGCAATATGACTCAAATCCAAGCTCCTTTTCGCGCCTATCCAAA 
TTATTCTGCTTCACTATATGATTATGCTGAGTTAGTATCTAGTCAAAAGT 
ATGCATCTGTTTGGAAATCAAATACCTCTTCTTATAAGGATGCTACTGCA 
G C T C T AAC AGGT C T T T AT G C G AC AG AT AC T G CT TAT G C TAG T AAAT T AAA 
C C AAAT TAT T G AAAC C T AC AG T C TAG AT G C T TAT GAT AAA 

SEQ ID NO. 6303 

STRAIN A909 

GGGGTTTGGTTTTATAATTATAA 

AAAT GAT AAT GT C GAAC C G AC AGT C ACT AGTG C AT C G GAT C AAAC G AC G A 
CTTTTATTCAAACGATTTCTCCAACAGCTATTGAAATTTCTAAGACCTAT 
GATTTGTATGCGTCAGTCTTATTAGCACAAGCTATTTTGGAATCATCCAG 
T GG AC AAT C AG AT T T G T CT AAG G CT C CT AAT TAT AAC C T CT T T GG C AT C A 
AAGG AG AAT AT AAAG G T AAAT C T GT C C AAAT G C C T ACT T TAG AAG AT GAT 
GGGAAAGGCAATATGACTCAAATCCAAGCTCCTTTTCGCGCCTATCCAAA 
T T AT T C T G C T T C AC TAT AT GAT TAT G C T GAG T TAG TAT C T AGT C AAAAG T 
ATGCATCTGCTTGGAAATCAAATACTTCTTCTTATAAGGATGCTACTGCA 
GCTCTAACAGGTCTTTATGCGACAGATACTGCTTATGCTAGTAAATTAAA 
CCAAATTATTGAAACCTACAGTCTAGATGCTTATGATAAA

SEQ ID NO. 6304

STRAIN H36B

GGGGTTTGGTTTTATAATTATAAAAATGATA 

AT GT CGAAC CG AC AGT C ACT AGT G CAT C GG AT C AAAC GAC G AC T T T TAT T 
C AAAC GAT T T CT C C AAC AGC T AT T G AAAT T T C T AAGAC CT AT GAT T T G T A 
T GCGT C AGT C T TAT TAG C AC AAGC T AT T T T GG AAT CAT C C AGT G GAC AAT 
CAGATTTGTCTAAGGCTCCTAATTATAACCTCTTTGGCATCAAAGGAGAA 
T AT AAAG GT AAAT CT GT C C AAAT G C C TAG T T T AG AAG AT GAT GGG AAAG G 
CAATATGACTCAAATCCAAGCTCCTTTTCGCGCCTATCCAAATTATTCTG 
CT T C AC TAT AT GAT TAT G CT G AGTT AG T AT C T AGT C AAAAGT a T GC AT C T 
GC T T GG AAAT CAAAT ACT T C T T C T T AT AAGG AT GC T ACT G C AGCT CT AAC 
AGGT C T T TAT GC GAC AG AT AC T G C T TAT G CT AG T AAAT T AAAC C AAAT T A 
T T GAAAC C T AC AGT CT AG AT G CT TAT GAT AAA 

SEQ ID NO. 6305 

STRAIN 18RS21 

GGGGTTTGGTTT TAT AAT T AT AAAAAT GAT AAT G 

T CGAAC C GAC AG T C ACT AGT G CAT C GG AT C AAAC GAC G ACT T T T AT T C AA 
AC GAT T T C T C C AAC AG C TAT T G AAAT T T C T AAG AC CT AT GAT T T GT AT G C 
GT C AGT C T TAT TAG C AC AAG C T AT T T T G GAAT CAT C C AG T GG AC AAT C AG 
ATTTGTCTAAGGCTCCTAATTATAACCTCTTTGGCATCAAAGGAGAATAT 
AAAG GT AAAT CT GT C CAAAT G C C T ACT T T AGAAG AT GAT G G G AAAGG C AA 
TATGACTCAAATCCAAGCTCCTTTTCGCGCCTATCCAAATTATTCTGCTT 
CACTATATGATTATGCTGAGTTAGTATCTAGTCAAAAGTATGCATCTGTT 
T GGAAAT C AAAT ACCTCTTCT TAT AAGG AT GCTACTGC AGCT CT AAC AGG 
T C T T TAT GC GAC AG AT AC T G C T T AT G C T AGT AAAT T AAAC CAAAT TAT T G 
AAAC CT AC AGT C TAG AT G CT T AT GAT AAA 

SEQ ID NO. 6306 

STRAIN M732 

GGGGTTTGGTTTTATAATTATAA 

AAAT GAT AAT G T CG AAC CG AC AG T C AC TAG T G C AT C GG AT C AAAC GAC G A 
CT T T TAT T C AAACG AT T T C T C C AAC AG CT AT T G AAAT T T C T AAGAC CT AT 
GAT T T G T AT GCGT C AGT C T T AT TAG C AC AAG C T AT T T T GGAAT CAT C C AG 
TGGACAATCAGATTTGTCTAAGGCTCCTAATTATAACCTCTTTGGCATCA
AAGGAGAATATAAAGGTAAATCTGTCCAAATGCCTACTTTAGAAGATGAT 
GGGAAAGGC AAT ATGACT CAAAT CCAAGCTCCTTTTCGCGCCT AT CCAAA 
T T ATT CT GCTTC ACT AT ATGATTATGCTGAGTT AGT AT CT AGT C AAAAGT 
ATGC AT CTGT T T GGAAAT C AAAT AC TTCTTCT TAT AAGG ATGCTACTGC A 
GCTCTAACAGGTCTTTATGCGACAGATACTGCTTATGCTAGTAAATTAAA 
C CAAAT TAT T G AAAC CT AC AG T C TAG AT G C T TAT GAT AAA 

SEQ ID NO. 6307 

STRAIN COH1 

GGGGTTTGGTTTTATAATTATAA 

AAAT G AT AAT GT C GAAC C GAC AGT C AC T AGT G CAT C GG AT C AAAC GAC GA 
C T T T TAT T C AAAC GAT T T CT C C AAC AG CT AT T GAAAT T T CT AAG AC C T AT 
GAT T T G TAT GCGT C AGT C T TAT T AG C AC AAG C TAT T T T GGAAT CAT C C AG 
TGGACAATCAGATTTGTCTAAGGCTCCTAATTATAACCTCTTTGGCATCA 
AAGGAGAATATAAAGGTAAATCTGTCCAAATGCCTACTTTAGAAGATGAT
GGGAAAGGCAATATGACTCAAATCCAAGCTCCTTTTCGCGCCTATCCAAA 
T TAT T C T G C T T C ACT AT AT GAT T ATG C T G AGT TAG TAT CT AGT C AAAAGT 
ATGCATCTGTTTGGAAATCAAATACTTCTTCTTATAAGGATGCTACTGCA 
GCTCTAACAGGTCTTTATGCGACAGATACTGCTTATGCTAGTAAATTAAA 
C CAAAT TAT T G AAAC C T AC AG T C TAG AT G CT TAT GAT AAA 

SEQ ID NO. 6308 

STRAIN M781

GGGGTTTGGTTTT AT AAT T AT AAAAAT GA 

TAATGTCGAACCGACAGTCACTAGTGCATCGGATCAAACGACGACTTTTA 
T T C AAAC GAT T T C T C C AAC AGC TAT T G AAATT T CT AAG AC C T AT GAT T T G 
TATGCGTCAGTCTTATTAGCACAAGCTATTTTGGAATCATCCAGTGGACA 
ATCAGATTTGTCTAAGGCTCCTAATTATAACCTCTTTGGCATCAAAGGAG 
AATATAAAGGTAAATCTGTCCAAATGCCTACTTTAGAAGATGATGGGAAA
GGCAATATGACTCAAATCCAAGCTCCTTTTCGCGCCTATCCAAATTATTC
T G CT T C ACT AT AT GAT TAT G C T G AGT T AGT AT CT AG T C AAAAGT AT G CAT 
CTGTTTGGAAATCAAATACTTCTTCTTATAAGGATGCTACTGCAGCTCTA 
AC AG GT CT T TAT G C GAC AGAT AC T G CT TAT G C T AGT AAAT T AAAC C AAAT 
TAT T GAAAC C T AC AGT C T AGAT G C T T AT GAT AAA 

SEQ ID NO. 6309 

STRAIN CJB110 

GGGGTTTGGTTT T AT AAT T AT AAAAAT G AT AAT GT 

C G AAC CG AC AGT C ACT AGT G CAT C G GAT C AAAC GAC G AC T T T TAT T C AAA 
CG AT T T C T C C AAC AG C TAT T G AAAT T T C T AAG AC C T AT GAT T T GT AT G CG 
T C AGT C T TAT TAG C AC AAG C T AT T T T G G AAT CAT C C AG T G GAC AAT C AGA 
T T T G T CT AAGG C T C CT AAT TAT AAC CT CT T T GGC AT C AAAGGAG AAT AT A 
AAG GT AAAT C T GT C C AAAT G C C T AC T T TAG AAG AT GAT G GG AAAGG C AAT 
ATGACTCAAATCCAAGCTCCTTTTCGCGCCTATCCAAATTATTCTGCTTC 
AC TAT AT GAT TAT G CT G AG T T AGT AT C T AGT C AAAAG T AT G CAT C T GT T T 
GGAAATCAAATACCTCTTCTTATAAGGATGCTACTGCAGCTCTAACAGGT 
CT T T AT G CG AC AGAT AC T G C T T AT G C TAG T AAAT T AAAC C AAAT TAT T G A 
AAC C T AC AGT C T AGAT G CT TAT GAT AAA 

SEQ ID NO. 6310 

STRAIN 1169NT 

GGGGTTTGGTTT TAT AAT TAT AAAAAT GAT AAT G T 

C G AAC AG AC AGT C AC T AGT G C AT CGG AT C AAAC GAC GACT T T TAT T C AAA 
CGATTTCCCCAACAGCTATTGAAATTTCTAAGACCTATGATTTGTATGCG 
T C AGT C T TAT TAG C AC AAG CT AT T T T G G AAT CAT C C AGT GG AC AAT C AG A 

TTTGTCTAAGGCTCCTAATTATAACCTCTTTGGCATCAAAGGAGAATATA 
AAG GT AAAT CT G T C C AAAT G C C T AC T T T AG AAGAT GAT G GGAAAG G C AAT 
ATGACTCAAATCCAAGCTCCTTTTCGCGCCTATCCAAATTATTCTGCTTC 
ACT AT AT GAT TAT G C T G AGT T AGT AT C T AGT C AAAAG TAT G CAT C T G T T T 
GGAAATCAAATACTTCTTCTTATAAGGATGCTACTGCAGCTCTAACAGGT 
CT T T AT G C GAC AG AT AC T G C T T AT G CT AGT AAAT T AAAC C AAAT TAT T G A 
AACCTACAGTCTAGATGCTTATGATAAA 

SEQ ID NO. 6311 

STRAIN JM9130013 

T TT G GT T T TAT AAT TAT AAAAAT GAT AAT GT C G AAC C GAC AGT C AC T AGT 
G CAT C G GAT C AAAC GAC GAC T T T T AT T C AAAC GAT T T C C C C AAC AGC T AT 
T G AAAT T T C T AAG AC C T AT G AT TT G T AT G C GT C AGT C T TAT TAG C AC AAG 
C T AT T T T GG AAT CAT C C AG T GG AC AAT C AG AT T T GT C T AAG G C T C C T AAT 
TAT AAC CT CT T T GG C AT C AAAG G AG AAT AT AAAG GT AAAT C T GTT C AAAT 
GC C T ACT T T AGAAG AT GAT G GG AAAGGT AAT AT GAC C C AAAT C C AAG CT C 
CTTTTCGCGCC TAT C C AAAT TAT T C T G CT T C AC TAT AT GAT TAT GCT GAG 
TTAGTATCTAGTCAAAAGTATGCATCTGTTTGGAAATCAAATACCTCTTC 
T T AT AAGGAT G C T AC T G C AG CT CT AAC AGGT C T T T AT G C GAC AG AT AC T G 
CT TAT G CT AGT AAAT T AAAC C AAAT TAT T G AAAAC T AC AGT C TAG AT GCT 
TAT GAT AAA 

SEQ ID NO. 6312 
STRAIN 2603 frame: 1 

MKSRKKDKLVLRLTTTLLVFGLGGVWFYNYKNDNVEPTVTSASDQTTTFIQTISPTAIEI
SKTYDLYASVLLAQAILESSSGQSDLSKAPNYNLFGIKGEYKGKSVQMPTLEDDGKGNMT
QIQAPFRAYPNYSASLYDYAELVSSQKYASVWKSNTSSYKDATAALTGLYATDTAYASKL
NQIIETYSLDAYDK

SEQ ID NO. 6313 

STRAIN 090 frame: 1 

GVWFYNYKNDNVEPTVTSASDQTTTFIQTISPTAIEISKTYDLYASVLLAQAILESSSGQ 
SDLSKAPNYNLFGIKGEYKGKSVQMPTLEDDGKGNMTQIQAPFRAYPNYSASLYDYAELV 
SSQKYASVWKSNTSSYKDATAALTGLYATDTAYASKLNQIIETYSLDAYDK

SEQ ID NO. 6314 

STRAIN A909 frame: 1

GVWFYNYKNDNVEPTVTSASDQTTTFIQTISPTAIEISKTYDLYASVLLAQAILESSSGQ
SDLSKAPNYNLFGIKGEYKGKSVQMPTLEDDGKGNMTQIQAPFRAYPNYSASLYDYAELV
SSQKYASAWKSNTSSYKDATAALTGLYATDTAYASKLNQIIETYSLDAYDK 

SEQ ID NO. 6315 

STRAIN H36B frame: 1

GVWFYNYKNDNVEPTVTSASDQTTTFIQTISPTAIEISKTYDLYASVLLAQAILESSSGQ
SDLSKAPNYNLFGIKGEYKGKSVQMPTLEDDGKGNMTQIQAPFRAYPNYSASLYDYAELV
SSQKYASAWKSNTSSYKDATAALTGLYATDTAYASKLNQIIETYSLDAYDK 

SEQ ID NO. 6316 

STRAIN 18RS21 frame: 1 

GVWFYNYKNDNVEPTVTSASDQTTTFIQTISPTAIEISKTYDLYASVLLAQAILESSSGQ
SDLSKAPNYNLFGIKGEYKGKSVQMPTLEDDGKGNMTQIQAPFRAYPNYSASLYDYAELV
SSQKYASVWKSNTSSYKDATAALTGLYATDTAYASKLNQIIETYSLDAYDK 

SEQ ID NO. 6317 

STRAIN M732 frame: 1

GVWFYNYKNDNVEPTVTSASDQTTTFIQTISPTAIEISKTYDLYASVLLAQAILESSSGQ
SDLSKAPNYNLFGIKGEYKGKSVQMPTLEDDGKGNMTQIQAPFRAYPNYSASLYDYAELV
SSQKYASVWKSNTSSYKDATAALTGLYATDTAYASKLNQIIETYSLDAYDK 

SEQ ID NO. 6318 

STRAIN M781 frame: 1 

GVWFYNYKNDNVEPTVTSASDQTTTFIQTISPTAIEISKTYDLYASVLLAQAILESSSGQ
SDLSKAPNYNLFGIKGEYKGKSVQMPTLEDDGKGNMTQIQAPFRAYPNYSASLYDYAELV
SSQKYASVWKSNTSSYKDATAALTGLYATDTAYASKLNQIIETYSLDAYDK

SEQ ID NO. 6319 

STRAIN CJB110 frame: 1 

GVWFYNYKNDNVEPTVTSASDQTTTFIQTISPTAIEISKTYDLYASVLLAQAILESSSGQ 
SDLSKAPNYNLFGIKGEYKGKSVQMPTLEDDGKGNMTQIQAPFRAYPNYSASLYDYAELV
SSQKYASVWKSNTSSYKDATAALTGLYATDTAYASKLNQIIETYSLDAYDK 

SEQ ID NO. 6320 

STRAIN 1169NT frame: 1 

GVWFYNYKNDNVEQTVTSASDQTTTFIQTISPTAIEISKTYDLYASVLLAQAILESSSGQ
SDLSKAPNYNLFGIKGEYKGKSVQMPTLEDDGKGNMTQIQAPFRAYPNYSASLYDYAELV
SSQKYASVWKSNTSSYKDATAALTGLYATDTAYASKLNQIIETYSLDAYDK 

SEQ ID NO. 6321 

STRAIN JM9130013 frame: 3 

WFYNYKNDNVEPTVTSASDQTTTFIQTISPTAIEISKTYDLYASVLLAQAILESSSGQSD
LSKAPNYNLFGIKGEYKGKSVQMPTLEDDGKGNMTQIQAPFRAYPNYSASLYDYAELVSS 
QKYASVWKSNTSSYKDATAALTGLYATDTAYASKLNQIIENYSLDAYDK 

SEQ ID NO. 6401 
STRAIN 2603 

AT G AAC AAGT CT AAG AAAAT C G AAAAT TAT C AAT TAT TAT T AC T AC AAG C G C AAG C T C T A 
TTCTCAGATGAAACAAATGCTCTTGCCAACTTATCAAATGCTTCAGCTATGCTAAATGCT 
ATGCTTCCAAATTCTGTATTTACAGGCTTTTATTTATTTGATGGAGAAGAGTTAATTCTT 
GGCCCTTTCCAGGGTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTGTGGT 
GAATCTGCACAAACTGCTAAGACGCTGATCGTTGATGATGTTACAAAGCATGCTAACTAT 
ATCTCCTGTGATTCAAAAGCTATGAGTGAAATCGTAGTACCTATGTTTAAAAATGGCAAA 
C T T C T AGG AGT T C TAG AT T TAG AT TCTTCTTT AGT AG C AG AT TAT GAT GAG AT T GAT C AA 
GAATACTTAGAAAAATTTGTAGGTATTCTAGTAGAACATACGATTTGGAATTTGGATATG 
TTTGGAGTTGAAAAG 

SEQ ID NO. 6402 

STRAIN 090 

CTCTATTCTCAGATGAAACAAATGCTCTTGCCAACTTA 
TCAAATGCTTCAGCTATGCTAAATGCTATGCTTCCAAATTCTGTATTTAC 
AGGCTTTTATTTATTTGATGGAAAGGAGTTAATTCTTGGCCCTTTCCAGG 
GTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTGTGGTGAA 
TCTGCACAAACTGCTAAGACGCTGATTGTTGATGATGTTACAAAGCATGC
TAACTATATCTCCTGTGATTCAAAAGCTATGAGTGAAATCGTAGTACCTA
TGTTTAAAAATGGCAAACTTCTAGGAGTTCTAGATTTAGATTCTTCTTTA
GTAGCAGATTATGATGAGATTGATCAAGAATACTTAGAAAAATTTGTAGG
TATTCTAGTAGAACATACGATTTGGAATTTGGATA 

SEQ ID NO. 6403 

STRAIN A909

CTCTATTCTCAGATGAAACAAATGCTCTTGCCAA 

CTTATCAAATGCTTCAGCTATGCTAAATGCTATGCTTCCAAATTCTGTAT 
TTACAGGCTTTTATTTATTTGATGGAGAAGAGTTAATTCTTGGCCCTTTC 
CAGGGTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTGTGG 
T G AAT C T G C AC AAAC T G CT AAG ACG CT GAT CGT T GAT G AT GT T AC AAAGC 

ATGCTAACTATATCTCCTGTGATTCAAAAGCTATGAGTGAAATCGTAGTA 
CCTATGTTTAAAAATGGCAAACTTCTAGGAGTTCTAGATTTAGATTCTTC 
T T T AGT AG C AGAT TAT GAT GAG AT T GAT C AAG AAT AC T T AGAAAAAT T T G 

TAGGTATTCTAGTAGAACATACGATTTGGAATTTGGATATGTTTGGAGTT 
GAAAAG 

SEQ ID NO. 6404 

STRAIN H36B

CTCTATTCTCAGATGAAACAAATGCTCTTGC 

CAACTTATCAAATGCTTCAGCTATGCTAAaTGCTATGCTTCCAAATTCTG 
TATTTACAGGCTTTTATTTATTTGATGGAGAAGAGTTAATTCTTGGCCCT 
TTCCAGGGTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTG 
TGGTGAATCTGCACAAACTGCTAAGACGCTGATCGTTGATGATGTTACAA 
AG CAT G CT AAC T AT AT C T C C T GT G AT T C AAAAG CT AT GAGT GAAAT CGT A 

GTACCTATGTTTAAAAATGGCAAACTTCTAGGAGTTCTAGATTTAGATTC 
T T CT T T AGT AG C AGAT TAT GAT G AGAT T GAT C AAG AAT AC T TAG AAAAAT 

TTGTAGGTATTCTAGTAGAACATACGATTTGGAATTTGGATATGTTTGGA 
GTT GAAAAG 

SEQ ID NO. 6405 

STRAIN 18RS21 

CTCTATTCTCAGATGAAACAAATGCTCTTGCCAACTT 

ATCAAATGCTTCAGCTATGCTAAATGCTATGCTTCCAAATTCTGTATTTA 
CAGGCTTTTATTTATTTGATGGAGAAGAGTTAATTCTTGGCCCTTTCCAG 
GGTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTGTGGTGA 
AT CTGC AC AAACT GCT AAG ACGCTGAT CGT TGATGATGTT AC AAAGC ATG 
C T AACT AT AT CT C C T G T GAT T C AAAAG CT AT GAGT GAAAT C GT AGT AC C T 

ATGTTTAAAAATGGCAAACTTCTAGGAGTTCTAGATTTAGATTCTTCTTT 
AGT AG C AG AT TAT GAT GAG AT T GAT C AAGAAT AC T T AGAAAAAT T T G T AG 

GTATTCTAGTAGAACATACGATTTGGAATTTGGATATGTTTGGAGTTGAA 
AAG 

SEQ ID NO. 6406 

STRAIN M732 

CTCTATTCTCAGATGAAACAAATGCTCTTGCCAACTT 
ATCAAATGCTTCAGCTATGCTAAATGCTATGCTTCCAAATTCTGTATTTA 
CAGGCTTTTATTTATTTGATGGAGAGGAGTTAATTCTTGGCCCTTTTCAG 
GGTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTGTGGTGA 
ATCTGCACAAACTGCTAAGACGCTGATTGTTGATGATGTTACAAAGCATG 
CT AACT AT ATCTCCTGTGATTCAAAAGCT AT GAGT GAAAT CGT AGT ACCC 
AT GTT TAAAAATGGCAAACTTCTAGG AGT TCT AGAT TT AGAT TCTTCTTT 
AGT AG C AGAT TAT GAT GAG AT T GAT C AAG AAT AC T T AGAAAAAT T T GT AG 
GT AT T C T AGT AG AAC AT AC GAT T T GG AAT T T G GAT AT G T T T G G AGT T G AA 
AAG 

SEQ ID NO. 6407 

STRAIN COH1 

CTCTATTCTCAGATGAAACAAATGCTCTTGCCAAC 

TTATCAAATGCTTCAGCTATGCTAAATGCTATGCTTCCAAATTCTGTATT 
TACAGGCTTTTATTTATTTGATGGAGAGGAGTTAATTCTTGGCCCTTTTC 
AGGGTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTGTGGT 
GAATCTGCACAAACTGCTAAGACGCTGATTGTTGATGATGTTACAAAGCA
TGCTAACTATATCTCCTGTGATTCAAAAGCTATGAGTGAAATCGTAGTAC
CCATGTTTAAAAATGGCAAACTTCTAGGAGTTCTAGATTTAGATTCTTCT
TTAGTAGCAGATTATGATGAGATTGATCAAGAATACTTAGAAAAATTTGT
AGGTATTCTAGTAGAACATACGATTTGGAATTTGGATATGTTTGGAGTTG
AAAAG 

SEQ ID NO. 6408 

STRAIN M781

CTCTATTCTCAGATGAAACAAATGCTCTTGCCAACTT 
ATCAAATGCTTCAGCTATGCTAAATGCTATGCTTCCAAATTCTGTATTTA 
CAGGCTTTTATTTATTTGATGGAGAGGAGTTAATTCTTGGCCCTTTTCAG 
GGTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTGTGGTGA 
AT CT G C AC AAAC T G C T AAG AC G C T GAT T GT T GAT GAT G T T AC AAAG CAT G 
C T AACT AT AT C T C C T GT GAT T C AAAAG C TAT GAGT G AAAT CGT AGT AC C C 
ATGTTTAAAAATGGCAAACTTCTAGGAGTTCTAGATTTAGATTCTTCTTT 
AGT AG C AG AT TAT GAT GAG AT T GAT C AAG AAT ACT TAG AAAAAT T T G T AG 

GTATTCTAGTAGAACATACGATTTGGAATTTGGATATGTTTGGAGTTGAA 
AAG 

SEQ ID NO. 6409 

STRAIN CJB110 

CTCTATTCTCAGATGAAACAAATGCTCTTGCCAACTTA
TCAAATGCTTCAGCTATGCTAAATGCTATGCTTCCAAATTCTGTATTTAC
AGGCTTTTATTTATTTGATGGAAAGGAGTTAATTCTTGGCCCTTTCCAGG
GTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTGTGGTGAA
TCTGCACAAACTGCTAAGACGCTGATTGTTGATGATGTTACAAAGCATGC
TAACTATATCTCCTGTGATTCAAAAGCTATGAGTGAAATCGTAGTACCTA
TGTTTAAAAATGGCAAACTTCTAGGAGTTCTAGATTTAGATTCTTCTTTA
GTAGCAGATTATGATGAGATTGATCAAGAATACTTAGAAAAATTTGTAGG
TATTCTAGTAGAACATACGATTTGGAATTTGGATATGTTTGGAGTTGAAA

SEQ ID NO. 6410 

STRAIN 1169NT 

CTCTATTCTCAGATGAAACAAATGCTCTTGCCAACTTA 
TCAAATGCTTCAGCTATGCTAAATGCTATGCTTCCAAATTCTGTATTTAC 
AGGCTTTTATTTATTTGATGGAGAAGAGTTAATTCTTGGCCCTTTCCAGG 
GTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTGTGGTGAA 
T CT G C AC AAACT G C T AAG AC G C T GAT T GT T GAT G AT GT T AC AAAG CAT G C 
T AAC TAT AT C T C C T GT GAT T C AAAAG CT AT G AGT GAAAT CGT AGT AC C C A 
TGTTTAAAAATGGCAAACTTCTAGGAGTTCTAGATTTAGATTCTTCTTTA 
GT AG C AGAT TAT GAT GAG AT T GAT C AAG AAT AC T TAG AAAAAT T T GT AGG 
TAT T CT AG TAG AAC AT AC GAT T T GG AAT T T G GAT AT GT T T G GAGT T G AAA 
AG 

SEQ ID NO. 6411 

STRAIN JM9130013 

CTCTATTCTCAGATGAAACAAATGCTCTTGCCAACTTA
TCAAATGCTTCAGCTATGCTAAATGCTATGCTTCCAAATTCTGTATTTAC
AGGCTTTTATTTATTTGATGGAGAAGAGTTAATTCTTGGCCCTTTCCAGG
GTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTGTGGTGAA
TCTGCACAAACTGCTAAGACGCTGATCGTTGATGATGTTACAAAGCATGC
TAACTATATCTCCTGTGATTCAAAAGCTATGAGTGAAATCGTAGTACCTA
TGTTTAAAAATGGCAAACTTCTAGGAGTTCTAGATTTAGATTCTTCTTTA
GTAGCAGATTATGATGAGATTGATCAAGAATACTTAGAAAAATTTGTAGG
TATTCTAGTAGAACATACGATTTGGAATTTGGATATGTTTGGAGTTGAAA
AG

SEQ ID NO. 6412 
STRAIN 2603 frame: 1 

MNKSKKIENYQLLLLQAQALFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGEELIL 
GPFQGGVSCVHITLGKGVCGESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGK 
LLGVLDLDSSLVADYDEIDQEYLEKFVGILVEHTIWNLDMFGVEK 

SEQ ID NO. 6413

STRAIN 090 frame: 3

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGKELILGPFQGGVSCVHITLGKGVC 
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID
QEYLEKFVGILVEHTIWNLD 

SEQ ID NO. 6414 

STRAIN A909 frame: 3

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGEELILGPFQGGVSCVHITLGKGVC 
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID
QEYLEKFVGILVEHTIWNLDMFGVEK 

SEQ ID NO. 6415 

STRAIN H36B frame: 3

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGEELILGPFQGGVSCVHITLGKGVC 
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID
QEYLEKFVGILVEHTIWNLDMFGVEK 

SEQ ID NO. 6416 

STRAIN 18RS21 frame: 3 

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGEELILGPFQGGVSCVHITLGKGVC 
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID
QEYLEKFVGILVEHTIWNLDMFGVEK 

SEQ ID NO. 6417 

STRAIN M732 frame: 3 

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGEELILGPFQGGVSCVHITLGKGVC 
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID 
QEYLEKFVGILVEHTIWNLDMFGVEK

SEQ ID NO. 6418 

STRAIN COH1 frame: 3 

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGEELILGPFQGGVSCVHITLGKGVC 
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID
QEYLEKFVGILVEHTIWNLDMFGVEK

SEQ ID NO. 6419 

STRAIN M781 frame: 3 

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGEELILGPFQGGVSCVHITLGKGVC 
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID 
QEYLEKFVGILVEHTIWNLDMFGVEK 

SEQ ID NO. 6420 

STRAIN M781 frame: 3

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGEELILGPFQGGVSCVHITLGKGVC 
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID 
QEYLEKFVGILVEHTIWNLDMFGVEK 

SEQ ID NO. 6421 

STRAIN CJB110 frame: 3 

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGKELILGPFQGGVSCVHITLGKGVC 
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID 
QEYLEKFVGILVEHTIWNLDMFGVEK 

SEQ ID NO. 6422 

STRAIN 1169NT frame: 3 

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGEELILGPFQGGVSCVHITLGKGVC 
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID 
QEYLEKFVGILVEHTIWNLDMFGVEK 

SEQ ID NO. 6423 

STRAIN JM9130013 frame: 3 

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGEELILGPFQGGVSCVHITLGKGVC 
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID 
QEYLEKFVGILVEHTIWNLDMFGVEK

SEQ ID NO. 6501
STRAIN 2603 

ATGAAAAAGAGTACCCAAATAATACTACTAATAGTTGCA 

T TAT T CAT ACT T GT T T T T AGC G GAG GAT T T TAT AT G AAAG AAC AAC AAAG AAAAGAAG AA 
CTAAAACGGAATCGAGAATATGAAGTTAGTCTAGTCAAAGCATTGAAAAATTCCTATGAG 
AATATAGAAGAAATAAAAATCACACATCCTGTTTCAACTGAAATTCCTGGAGATTGGCAT 
TGTACTGTAAAGATTTCATTTAATGATAAAAAATCTATTGTTTATAATATTACACATAAT 
T T G GAAT C GAAAAAAAAT T AT AG CGGAAAATT T AAT GAAAAAAAT AT G AAT T T T T T T GAT 
T C AAG AAT T GG T AAAAC AAAAAAAAC T AT AAAAAT T AT T TT T TC AGAT GG T C AGG AGAAG 
ATACAA 

SEQ ID NO. 6502 

STRAIN 090 

GGAGGATTTTATATGAAAGAACA 

AC AAAG AAAAGAAG AAC T AAAAC GG AAT C GAG AAT AT G AAGT TAG T C T AG 
T CAAAG CAT T GAAAAAT T C C TAT GAG AAT AT AG AAGAAAT AAAAAT C AC A 
CATCCTGTTTCAACTGAAATTCCTGGAGATTGGCATTGTACTGTAAAGAT 
T T CAT T T AAT GAT AAAAAAT C T AT T GT T TAT AAT AT T AC AC AT AAT T T G G 
AAT C GAAAAAAAAT TAT AG C G G AAAT T T T AAT GAAAAAAAT AT GAAT T T T 
T T T GAT T C AAG AAT T GG T AAAAC AAAAAAAACT AT AAAAAT TAT T T T T T C 
AG At GG t C AGG AG AAG AT a C AA 

SEQ ID NO. 6503 

STRAIN A909 

GG AGG AT TT TAT AT G AAAGAAC AAC AA 

AG AAAAGAAGAACT AAAAC G GAAT CG AG AAT AT G AAG T T AG T C T AGT C AA 
AG CAT T GAAAAAT T C C TAT GAGAAT AT AG AAG AAAT AAAAAT C AC AC AT C 
CTGTTTCAACTGAAATTCCTGGAGATTGGCATTGTACTGTAAAGATTTCA 
TT TAAT GAT AAAAAAT CT ATT GT TTAT AAT ATT AC ACAT AAT TT GGAAT C 
GAAAAAAAAT TATAG CGGAAAATT TAAT GAAAAAAAT AT GAAT TTTTTTG 
ATT CAAGAAT TGGT AAAAC AAAAAAAACT AT AAAAATTATTTT T T C AGAT 
GG t C AGG AG AAG AT AC AA 

SEQ ID NO. 6504 

STRAIN H36B

GGAGGATTTTATATGAAAGAACA 

AC AAAG AAAAGAAG AAC T AAAACGG AAT C GAG AAT AT G AAG T T AGT C T AG 
T CAAAGCATT GAAAAATT CCT ATGAGAAT AT AGAAGAAAT AAAAAT CACA 
CAT C C T GT T T C AAC T G AAAT T C C T GG AG AT T GG C AT T GT AC T GT AAAG AT 
TT CAT TTAAT GAT AAAAAAT CT AT TGTTT AT AAT ATT AC ACAT AAT TTGG 
AAT C GAAAAAAAAT TATAG C G G AAAAT T TAAT GAAAAAAAT AT GAAT T T T 
TTTGATTCAAGAATTGGTAAAACAAAAAAAACTATAAAAATTAtTTTTTC 
AGAT G G t C AGG AGAAG AT a C AA 

SEQ ID NO. 6505 

STRAIN 18RS21 

GG AGG AT TT T ATAT GAAAGAACAAC 

AAAGAAAAGAAGAACTAAAACGGAATCGAGAATATGAAGTTAGTCTAGTC 
AAAGCATT GAAAAATT CCTATGAGAAT AT AGAAGAAAT AAAAAT CACAC A 
TCCTGTTTCAACTGAAATTCCTGGAGATTGGCATTGTACTGTAAAGATTT 
CAT T TAAT GAT AAAAAAT CTATTGT TTAT AAT AT TACACATAATTT GGAA 
T CGAAAAAAAAT TAT AGC GGAAAATTTAAT GAAAAAAAT AT GAATT TT T T 
T GAT TCAAG AAT TGGT AAAAC AAAAAAAACT AT AAAAAT T ATT TTT T C AG 
AT GG t CAG G AG AAG AT a CAA 

SEQ ID NO. 6506 

STRAIN M781 

G GAG GAT T T TAT AT G AAAG AAC AAC AAAG AAAA 

GAAGAACT AAAACGGAAT CGAGAATATGAAGT T AGT CT AGT CAAAG CAT T 
GAAAAAT T CCT AT GAG AAT AT AG AAGAAAT AAAAAT CACAC AT CCT GTTT 
CAACTGAAATTCCTGGAGATTGGCATTGTACTGTAAAGATTTCATTTAAT 
GAT AAAAAAT CT AT T G T T TAT AAT AT T AC AC AT AAT T TG G AAT C G AAAAA 
AAATTATAGCGGAAAATTTAATGAAAAAAATATGAATTTTTTTGATTCAA
GAATTGGTAAAACAAAAAAAACTATAAAAATTATTTTTTCAGATGGTCAG
GAGAAGATACAA 

SEQ ID NO. 6507 

STRAIN CJB110 

G G AGGAT T T TAT AT G AAAG AAC AAC AAAG AAAAG AAG AA 
CT AAAACG GAAT CG AG AAT AT G AAGT T AGT CT AGT C AAAG CAT T GAAAAA 
T T C CT AT G AGAAT AT AGAAG AAAT AAAAAT C AC AC AT C CT GT T T C AAC T G 
AAAT T C CT GG AG AT T GGC AT T GT ACT G T AAAG AT T T CAT T T AAT GAT AAA 
AAATCTATTGTTTATAATATTACACATAATTTGGAATCGAAAAAAAATTA 
TAG C G G AAAT T T T AAT G AAAAAAAT AT GAAT T T T T T T GAT T C AAG AAT T G 
GTAAAACAAAAAAAAC TAT AAAAAT TAT T T T T T C AG AT G GT C AGG AG AAG 
ATACAA 

SEQ ID NO. 6508 

STRAIN 1169NT 

GGAGGAT TT T AT ATGAAAGAAC AACAAAG 

AAAAG AAG AAC TAAAAC G GAAT C GAG AAT AT GAAG T TAG T CT AG T C AAAG 
C ATTGAAAAAT T CCTATGAGAAT AT AGAAGAAAT AAAAAT C AC ACAT C CT 
GTTTCAACTGAAATTCCTGGAGATTGGCATTGTACTGTAAAGATTTCATT 
TAATGATAAAAAATCTATTGTTTATAATATTACACATAATTTGGAATCGA 
AAAAAAAT T AT AGTGGAAAATTTAATG AAAAAAAT AT GAATTTTTTTGAT 
T CAAGAATT GGTAAAACAAAAAAAAC TAT AAAAAT T ATT TT T T CAGAT GG 
T C AGG AG AAG AT AC AA 

SEQ ID NO. 6509 

STRAIN JM9130013 
GGAGGATT T TAT AT GAAAGAACAAC 

AAAG AAAAG AAGAAC T AAAACG GAAT CG AG AAT AT G AAGT TAG T C T AG T C 
AAAGC ATT GAAAAATT CCTATG AGAAT AT AGAAGAAAT AAAAAT CACACA 
TCCTGTTTCAACTGAAATTCCTGGAGATTGGCATTGTACTGTAAAGATTT 
CAT TT AAT GAT AAAAAAT CTAT T GT TTAT AATAT TACACAT AATTT GGAA 
T CGAAAAAAAATT AT AGCGGAAAATTTAATG AAAAAAAT AT GAAT T T T TT 
T GAT T C AAG AAT T GGTAAAACAAAAAAAAC TAT AAAAAT TAT T T T T T C AG 
At GGtC AGG AG AAG ATACAA 

SEQ ID NO. 6510 
STRAIN 2603 frame: 1

MKKSTQIILLIVALFILVFSGGFYMKEQQRKEELKRNREYEVSLVKALKNSYENIEEIKI
THPVSTEIPGDWHCTVKISFNDKKSIVYNITHNLESKKNYSGKFNEKNMNFFDSRIGKTK
KTIKIIFSDGQEKIQ

SEQ ID NO. 6511 

STRAIN 090

GGFYMKEQQRKEELKRNREYEVSLVKALKNSYENIEEIKITHPVSTEIPGD
WHCTVKISFNDKKSIVYNITHNLESKKNYSGNFNEKNMNFFDSRIGKTKKTIKIIFSDGQ
EKIQ

SEQ ID NO. 6512 

STRAIN A909 

GGFYMKEQQRKEELKRNREYEVSLVKALKNSYENIEEIKITHPVSTEIPGDWH
CTVKISFNDKKSIVYNITHNLESKKNYSGKFNEKNMNFFDSRIGKTKKTIKIIFSDGQEK
IQ

SEQ ID NO. 6513 

STRAIN H36B

GGFYMKEQQRKEELKRNREYEVSLVKALKNSYENIEEIKITHPVSTEIPGD
WHCTVKISFNDKKSIVYNITHNLESKKNYSGKFNEKNMNFFDSRIGKTKKTIKIIFSDGQ
EKIQ

SEQ ID NO. 6514 

STRAIN 18RS21 

GGFYMKEQQRKEELKRNREYEVSLVKALKNSYENIEEIKITHPVSTEIPGDW 
HCTVKISFNDKKSIVYNITHNLESKKNYSGKFNEKNMNFFDSRIGKTKKTIKIIFSDGQE
KIQ

SEQ ID NO. 6515 

STRAIN CJB110 

GGFYMKEQQRKEELKRNREYEVSLVKALKNSYENIEEIKITHPVSTEIPGDWHCTVK 
ISFNDKKSIVYNITHNLESKKNYSGNFNEKNMNFFDSRIGKTKKTIKIIFSDGQEKIQ 

SEQ ID NO. 6516 

STRAIN JM9130013 

GGFYMKEQQRKEELKRNREYEVSLVKALKNSYENIEEIKITHPVSTEIPGDW
HCTVKISFNDKKSIVYNITHNLESKKNYSGKFNEKNMNFFDSRIGKTKKTIKIIFSDGQE
KIQ

SEQ ID NO. 6517 

STRAIN 1169NT frame: 1 

GGFYMKEQQRKEELKRNREYEVSLVKALKNSYENIEEIKITHPVSTEIPGDWHCTVKISF 
NDKKSIVYNITHNLESKKNYSGKFNEKNMNFFDSRIGKTKKTIKIIFSDGQEKIQ 

SEQ ID NO. 6518 

STRAIN M781 frame: 1

GGFYMKEQQRKEELKRNREYEVSLVKALKNSYENIEEIKITHPVSTEIPGDWHCTVKISF
NDKKSIVYNITHNLESKKNYSGKFNEKNMNFFDSRIGKTKKTIKIIFSDGQEKIQ

SEQ ID NO. 6601 
STRAIN 2603 

T T G AC AAG GC AT AT AAAAAT T T C T AT AC T AAAT T T AC AAAAT G AAGG AG AGG G AACT AT G 
G AAAT AC T GAT T G C AGGT G G T AGT GGT T T T T T AGG AAAG C AG AT AAT AAAAG C AG CG C T T 
ACAAAAGGGCATAAAGTGGCTTACTTATCAAGACATGAAGGTAAAGGTGATATATTTAAG 
GAT CCT AGATT AAC CTAC ATT AGGGGAGAT ATT AC AGAAG CT GAT AAGATT CATT TAGAA 
GAC AG AAC T T T T GAT AT AT T AAT T G AC T GT AT T G GAG C GAT T AAG C C C AAT C AAC TAG AT 
G AG CT T AACGT T AAAGC AAC C C AAAAAG C AGT AGC ACT C T GT C AC AAAAAT C AAAT AC C A 
AAGTTAGTTTATATTTCAGCCAACAGCGGCTATTCAGCTTACATTAAAAGTAAAAGGAAG 
G C AG AG C AGAT AAT C AAAG CAAG C GG T C TGG AT TAT CT T T T T G T AAG AC C AG G T T T GATG 
TATGGTGAAGAGCGACCTCTCTCGATTTTCCAAGCCAAGTGTATAAAGTTATTTAGTCAT 
TTGCCTTTCTTAGGTATTGTTGTACAAAAGGTCTTTCCAACTAAGGTTGTGATAGTGGCA 
G AAG C AAT CGT TAG T AC G C T T AGGAAAAAAC C AAC C C AAAAAAT C CT T T C T AT T G AAGAA 
TT AAAT AAT AAA 

SEQ ID NO. 6602 
STRAIN 090 

AC AAG G CAT AT AAAAAT T T CT AT ACT AAAT T T AC AAAAT 
GAAGGAGAGGGAACTATGG AAAT ACTG ATT GCAGGTGGT AGT GGT TTTTT 
AGG AAAG C AG AT AAT AAAAG C AG C G C T T AC AAAAGGG C AT AAAGT GG C T T 
ACT TAT CAAG AC AT G AAG GT AAAGGT G AT AT AT T T AAG GAT C CT AG AT T A 
ACCTACATTAGGGGAGATATTACAGAAGCTGATAAGATTCATTTAGAAGA 
C AG AACT T T T GAT AT AT T AAT T G ACT GT AT T G G AG CG AT T AAG C C C AAT C 
AAC TAG AT G AG CT T AAC GT T AAAG C AAC C C AAAAAG C AGT AG C AC T C T GT 
C AC AAAAAT C AAAT AC C AAAGT T AGT T TAT AT T T C AG C C AAC AG C GG CT A 
T T C AG CT T AC AT T AAAAG TAAAAGG AAG G C AG AG C AG AT AAT C AAAG C AA 
GCGGTCTGGATTATCTTTTTGTAAGACCAGGTTTGATGTATGGTGAAGAG 
CGACCTCTCTCGATTTTCCAAGCCAAGTGTATAAAGTTATTTAGTCATTT 
GCCTTTCTT AGGT ATT GTTGTACAAAAGGTCTTTCCAACTAAGGTTGTGA 
T AGT GG C AG AAG C AAT CGT T AC T AC G CT T AGG AAAAAAC C AAC C C AAAAA 
AT C CTTT CT ATT GAAGAAT T AAAT AAT AAA 

SEQ ID NO. 6603 
STRAIN A909 

AC AAGGC AT AT AAAAAT TTCT AT ACT AAAT TT AC AAAAT G 

AAG GAG AG GG AACT AT G G AAAT ACT GAT T GC AGGT G G T AG TGGTTTTTTA 

GGAAAGCAGATAATAAAAGCAGCGCTTACAAAAGGGCATAAAGTGGCTTA 

CTTATCAAGACATGAAGGTAAAGGTGATATATTTAAGGATCCTAGATTAA 

CC T AC AT T AG G G GAG AT AT T AC AG AAG C T GAT AAG AT T CAT T TAG AAG AC 

AGAACTTTTGATATATTAATTGACTGTATTGGAGCGATTAAGCCCAATCA 

ACTAGATGAGCTTAACGTTAAAGCAACCCAAAAAGCAGTAGCACTCTGTC
ACAAAAATCAAATACCAAAGTTAGTTTATATTTCAGCCAACAGCGGCTAT
T C AGCT T AC AT T AAAAGT AAAAGGAAGG C AG AGC AGAT AAT C AAAG C AAG 
CGGTCTGGATTATCTTTTTGTAAGACCAGGTTTGATGTATGGTGAAGAGC 
GACCTCTCTCGATTTTCCAAGCCAAGTGTATAAAGTTATTTAGTCATTTG 
CCTTTCTTAGGTATTGTTGTACAAAAGGTCTTTCCAACTAAGGTTGTGAT 
AGT GG C AG AAGC AAT C GT T AC T ACGC T T AGG AAAAAAC CAAC C C AAAAAA 
T C CT T T C TAT T G AAG AAT T AAAT AAT AAA 

SEQ ID NO. 6604 
STRAIN H36B 

TAT AAAAAT T T C TAT ACT AAAT T T AC AAAAT G AAG GAGAG GG AAC TAT G G 
AAATACTGATTGCAGGTGGTAGTGGTTTTTTAGGAAAGCAGATAATAAAA 
G C AG C G CT T AC AAAAG GG C AT AAAG T GG CT T AC T TAT C AAG AC AT G AAG G 
TAAAGGTGATATATTTAAGGATCCTAGATTAACCTACATTAGGGGAGATA 
T T AC AG AAG C T GAT AAG AT T CAT T T AG AAGAC AGAACT T T T GAT AT AT T A 
ATT GAC T GT AT T GGAG CG AT T AAG C C C AAT CAAC TAG AT GAG C T T AAC G T 
T AAAG CAAC C C AAAAAG C AGT AG C AC T CT G T C AC AAAAAT C AAAT AC C AA 
AGT T AGT T TAT AT T T C AG C CAAC AG CG G CT AT T C AGC T T AC AT T AAAAG T 
AAAAGG AAG G C AG AGC AG AT AAT C AAAG C AAGC G G T C T G GAT TAT C T T T T 
TGTAAGACCAGGTTTGATGTATGGTGAAGAGCGACCTCTCTCGATTTTCC 
AAGCCAAGTGTATAAAGTTATTTAGTCATTTGCCTTTCTTAGGTATTGTT 
GTACAAAAGGTCTTTCCAACTAAGGTTGTGATAGTGGCAGAAGCAATCGT 
TACT ACGCT TAGG AAAAAAC CAACC CAAAAAAT C CTTT CT ATT GAAGAAT 
TAAATAATAAA 

SEQ ID NO. 6605 
STRAIN 18RS21 

ACAAGGCAT ATAAAAATT T CT AT ACT AAAT TT AC AAAAT 
GAAGGAGAGGGAACTATGGAAATACTGATTGCAGGTGGTAGTGGTTTTTT 
AG G AAAGC AG AT AAT AAAAG C AGC G C T T AC AAAAG G G CAT AAAGT G G C T T 
ACT T AT CAAGAC ATGAAGGT AAAGGTGAT AT ATT T AAGGAT CCT AGATTA 
ACCTACATTAGGGGAGATATTACAGAAGCTGATAAGATTCATTTAGAAGA 
C AG AACT T T T GAT AT AT T AAT T G ACT G TAT T G G AG C GAT T AAG C C C AAT C 
AAC TAG AT G AG CT T AAC G T T AAAGC AAC C C AAAAAG C AGT AGC AC T C T GT 
CACAAAAATCAAATACCAAAGTTAGTTTATATTTCAGCCAACAGCGGCTA 
T T C AG C T T AC AT T AAAAG T AAAAG G AAG G C AG AGC AGAT AAT C AAAG C AA 
GCGGTCTGGATTATCTTTTTGTAAGACCAGGTTTGATGTATGGTGAAGAG 
CGACCTCT CT CGAT TTTCC AAGC CAAGTGT AT AAAGT T AT TT AGT CATTT 
GCCTTTCTTAGGTATTGTTGTACAAAAGGTCTTTCCAACTAAGGTTGTGA 
TAGTGGCAGAAGCAATCGTTACTACGCTTAGGAAAAAACCAACCCAAAAA 
AT C CT T T CT AT T G AAG AAT T AAAT a AT AAA 

SEQ ID NO. 6606 
STRAIN M732 

C AAAAT G AAG G AG Ag GG AAC T AT G g AAAT AC T GAT T G C AG GT G GT AG T GG 
T T T T C TAG G G AAG C AG AT AAT AAAAG C AG C G CT T AC AAAAG G G C AT AAGG 
T G G CT TACT TAT C AAGG C AT GAAG GT AAAG GT G AT AT AT T T AAG GAT C c T 
AGATT AAC CT ACATT AAGGGAGAT AT T ACAGAAGCT GAT AAGAT T C AT TT 
AG a AC AT AG AAAT T T T GAT AT AT T AAT T GAC T GT AT T GGAG C GAT T AAG C 
C C AAT CAAC TAG AT GAG C T T AAC G T T AAAG CAAC C C AAAAAG C AG TAG C A 
CT CT GT CACAAAAAT C AAAT ACC AAAGT TAGTTT ACATT T CAGCCAAT AG 
C G G CT AT T C AG C T T AC AT T AAAAG T AAAAG GAAG G C AG AG C AG AT AAT C A 
AAGCAAGCGGTCTGGATTATCTTTTTGTAAGACCAGGTTTGATGTATGGT 
GAAG AG CG AC C T C T CT C GAT T T T C C AAG C C AAGT G TAT AAAAT TAT T TAG 
TCATTTGCCTTTCTTAGGTATTGTTGTACAAAAAGTCTTTCCAACTAAGG 
TTGTGATAGTGGCAGAAGCAATCGTTACTTCGCTTAGGAAAAAACCAACT 
CAAAAAAT CCTTT CT AT T GAAGAATT AAAT AAT AAA 

SEQ ID NO. 6607 
STRAIN COH1 

ACAAGGCAT AT AAAAAT TTCTAT ACT AAATTT AC 

AAAAT GAAG G AG AGG GAAC T AT GG AAAT AC T GAT T G C AG G T G G T AGT G GT 
T T T CT AG G GAAG C AG AT AAT AAAAG C AG CG C T T AC AAAAG G G CAT AAG GT 
GGCTTACTTATCAAGGCATGAAGGTAAAGGTGATATATTTAAGGATCCTA 
GATTAACCTACATTAAGGGAGATATTACAGAAGCTGATAAGATTCATTTA
G AAC AT AGAAAT T T T GAT AT AT T AAT T G ACT GT AT T G GAG C GAT T AAG C C 
C AAT C AAC TAG AT GAGCT T AAC GT T AAAG C AAC C C AAAAAG C AGT AG C AC 
TCTGTCACAAAAATCAAATACCAAAGTTAGTTTACATTTCAGCCAATAGC 
GGCTATTCAGCTTACATTAAAAGTAAAAGGAAGGCAGAGCAGATAATCAA 
AGCAAGCGGTCTGGATTATCTTTTTGTAAGACCAGGTTTGATGTATGGTG 
AAG AG C G AC CT CT C T C GAT T T T C C AAG C C AAGT GT AT AAAAT T AT T T AGT 
CATTTGCCTTTCTTAGGTATTGTTGTACAAAAAGTCTTTCCAACTAAGGT 
TGTGATAGTGGCAGAAGCAATCGTTACTTCGCTTAGGAAAAAACCAACTC 
AAAAAAT C CTT T CT AT T GAAGAATT AAAT AAT AAA 

SEQ ID NO. 6608 
STRAIN M781

ACAAGGCATATAAAAATTTcTATACTAAATTTaCA 

AAAT G AAGG AG AG G G AACT AT G G AAAT AC T GAT T G C AGG T GGT AGT GG T T 
T T CT AG G G AAG C AG AT AAT AAAAG C AG C G C T T AC AAAAG GG C AT AAG GT G 
G C T T AC T TAT C AAG G CAT G AAGGT AAAG GT GAT AT AT T T AAG GAT C C T AG 
ATTAACCTACATTAAGGGAGATATTACAGAAGCTGATAAGATTCATTTAG 
AACATAGAAATTTTGATATATTAATTGACTGTATTGGAGCGATTAAGCCC 
AATCAACTAGATGAGCTTAACGTTAAAGCAACCCAAAAAGCAGTAGCACT 
CTGTCACAAAAATCAAATACCAAAGTTAGTTTACATTTCAGCCAATAGCG 
G CT AT T C AG C T T AC AT T AAAAG TAAAAGG AAG G C AG AG C AGAT AAT C AAA 
GCAAGCGGTCTGGATTATCTTTTTGTAAGACCAGGTTTGATGTATGGTGA 
AG AG C G AC CT CT C T CG AT T T T C C AAG C C AAGT GT AT AAAAT TAT T T AGT C 
ATTTGCCTTTCTTAGGTATTGTTGTACAAAAAGTCTTTCCAACTAAGGTT 
GTGATAGTGGCAGAAGCAATCGTTACTTCGCTTAGGAAAAAACCAACTCA 
AAAAAT CCTTTCTAtT G AAGAAT T AAAT AAT AAA 

SEQ ID NO. 6609 
STRAIN 1169NT 

ACAAGGCATATAAAAATTTCTATACTAAATTTACAAA 
AT G AAG G AG AGGG AAC T AT GG AAAT AC T GAT T G C AGGT G GT AGT GGT T T T 
T TAG G AAAG C AG AT AAT AAAAGC AG C GC T T AC AAAAG G G CAT AAGT T G G C 
TTACTTATCAAGACATGAAGGTAAAGGTGATATATTTAAGGATCCTAGAT 
T AACCTAC AT T AAGGGAGATAT TACAGAAGCT GAT AAG ATT CAT T T AGAA 
G AC AG AAC T T T T GAT AT AT T AAT T G AC T GT AT T GG AG C G AT T AAG C C C AA 
T C AAC TAG AT GAG C T T AAC GT T AAAG C AAC C C AAAAAG C AGT AGC AC T C T 
GT C AC AAAAAT C AAAT AC C AAAGT T AGT T T AC AT T T C AG C C AAC AG C G G C 
TAT T CAGCTT AC AT T AGAAGTAAAAGGAAGGCAGAGC AGAT AAT CAAAGC 
AAGCGGTCTGGATTATCTTTTTGTAAGACCAGGTTTGATGTATGGTGAAG 
AG C G AC C T C T CT CG AT T T T C C AAG C C AAGT GT AT AAAAT TAT T T AGT CAT 
TTGCCTTT CTT AGGT ATTGTTGTACAAAAGGT CTT TCCAACT AAGGT TGT 
GAT AGT G G C AG AAG C AAT C GT T AC T AC G CT T AGGAC AAAAC C AAC T C AAA 
AAATCCTTTCTATTGAAGAATTAAATAATAAA 

SEQ ID NO. 6610 
STRAIN CJB110 

AC AAG G CAT AT AAAAAT TT CT AT ACT AAAT TT AC AAA 
ATGAAGGAGAGGGAACTATGGAAATACTGATTGCAGGTGGTAGTGGTTTT 
T T AG G AAAG C AG AT AAT AAAAG C AG C G CT T A C AAAAG G G CAT AAAG T G G C 
T T AC T T AT C AAGAC AT G AAG GT AAAGGT GAT AT AT T T AAG GAT C CT AG AT 
TAACCTACATTAGGGGAGATATTACAGAAGCTGATAAGATTCATTTAGAA 
GACAGAACTTTTGATATATTAATTGACTGTATTGGAGCGATTAAGCCCAA 
T C AACT AG AT G AG CT T AAC G T T AAAG C AAC CC AAAAAG C AGT AG C AC T C T 
GT CAC AAAAAT C AAAT AC C AAAGT TAGTT TAT AT TTCAGCC AAC AG CGGC 
TAT T C AG C T T AC AT T AAAAG T AAAAG G AAGGC AG AG C AG AT AAT C AAAG C 
AAGCGGTCTGGATT AT CTTTTTGT AAGAC CAGGTTTGATGTATGGTGAAG 
AGCGACCTCTCTCGATTTTCCAAGCCAAGTGTATAAAGTTATTTAGTCAT 
TT GC CT T T CTT AGGT AT TGTTGTACAAAAGGTCTT TCCAACT AAGGT TGT 
GAT AG T GG C AG AAG C AAT C GT T AC T ACG CT T AGG AAAAAAC C AAC C C AAA 
AAAT CCTTTCT AT TGAAGAATT AAAT AAT AAA 

SEQ ID NO. 6611 
STRAIN JM9130013

AC AAG GC AT AT AAAAAT T T C T AT AC T AAAT T T AC AAAAT G
AAGGAGAGGGAACTATGGAAATACTGATTGCAGGTGGTAGTGGTTTTTTA 
GG AAAG C AG AT AAT AAAAG C AG C G C T T AC AAAAGGGC AT AAAGT GG CT T A 
CT T AT C AAGAC AT GAAGGT AAAGG T GAT AT AT T T AAG GAT C CT AGAT T AA 
C C T AC AT TAG G GGAGAT AT T AC AG AAG C T G AT AAGAT T CAT T T AGAAG AC 
AGAACTTTTGATATATTAATTGACTGTATTGGAGCGATTAAGCCCAATCA 
ACT AGAT GAG C T T AAC GT T AAAG C AAC C C AAAAAG C AGT AG C ACT CT GT C 
ACAAAAATCAAATACCAAAGTTAGTTTATATTTCAGCCAACAGCGGCTAT 
T C AG C T T AC AT TAAAAGT AAAAG G AAGG C AGAG C AG AT AAT C AAAG C AAG 
CGGTCTGGATTATCTTTTTGTAAGACCAGGTTTGATGTATGGTGAAGAGC 
GACCTCTCTCGATTTTCCAAGCCAAGTGTATAAAGTTATTTAGTCATTTG 
CCtTTCTTAgGTATTGTTGTACAAAAGGTCTTTCCAACTAAGGTTGTGAT 
AGT G G C AGAAGC AAT C GT TACT AC G CT TAG G AAAAAAC C AAC C C AAAAAA 
TCCTTTCTATTGAAGAATTAAATAATAAA 

SEQ ID NO. 6612 
STRAIN 2603 frame: 1

TRHIKISILNLQNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKVAYLSRHEGKGDIFKD 
PRLTYIRGDITEADKIHLEDRTFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPK 
LVYISANSGYSAYIKSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHL 
PFLGIVVQKVFPTKVVIVAEAIVTTLRKKPTQKILSIEELNNK

SEQ ID NO. 6613 

STRAIN 090 frame: 1

TRHIKISILNLQNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKVAYLSRHEGKGDIFKD 
PRLTYIRGDITEADKIHLEDRTFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPK 
LVYISANSGYSAYIKSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHL 
PFLGIVVQKVFPTKVVIVAEAIVTTLRKKPTQKILSIEELNNK

SEQ ID NO. 6614 

STRAIN A909 frame: 1

TRHIKISILNLQNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKVAYLSRHEGKGDIFKD 
PRLTYIRGDITEADKIHLEDRTFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPK 
LVYISANSGYSAYIKSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHL 
PFLGIVVQKVFPTKVVIVAEAIVTTLRKKPTQKILSIEELNNK

SEQ ID NO. 6615 

STRAIN H36B frame: 2

IKISILNLQNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKVAYLSRHEGKGDIFKDPRL 
TYIRGDITEADKIHLEDRTFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPKLVY 
ISANSGYSAYIKSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHLPFL 
GIVVQKVFPTKVVIVAEAIVTTLRKKPTQKILSIEELNNK

SEQ ID NO. 6616 

STRAIN 18RS21 frame: 1 

TRHIKISILNLQNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKVAYLSRHEGKGDIFKD 
PRLTYIRGDITEADKIHLEDRTFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPK 
LVYISANSGYSAYIKSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHL 
PFLGIVVQKVFPTKVVIVAEAIVTTLRKKPTQKILSIEELNNK

SEQ ID NO. 6617 

STRAIN M732 frame: 1 

QNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKVAYLSRHEGKGDIFKDPRLTYIKGDIT 
EADKIHLEHRNFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPKLVYISANSGYS 
AYIKSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHLPFLGIVVQKVF 
PTKVVIVAEAIVTSLRKKPTQKILSIEELNNK

SEQ ID NO. 6618 

STRAIN COH1 frame: 1 

TRHIKISILNLQNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKVAYLSRHEGKGDIFKD 
PRLTYIKGDITEADKIHLEHRNFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPK 
LVYISANSGYSAYIKSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHL 
PFLGIVVQKVFPTKVVIVAEAIVTSLRKKPTQKILSIEELNNK

SEQ ID NO. 6619

STRAIN M781 frame: 1 

TRHIKISILNLQNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKVAYLSRHEGKGDIFKD 
PRLTYIKGDITEADKIHLEHRNFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPK 
LVYISANSGYSAYIKSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHL
PFLGIVVQKVFPTKVVIVAEAIVTSLRKKPTQKILSIEELNNK

SEQ ID NO. 6620 

STRAIN 1169NT frame: 1 

TRHIKISILNLQNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKLAYLSRHEGKGDIFKD 
PRLTYIKGDITEADKIHLEDRTFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPK 
LVYISANSGYSAYIRSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHL 
PFLGIVVQKVFPTKVVIVAEAIVTTLRTKPTQKILSIEELNNK

SEQ ID NO. 6621 

STRAIN CJB110 frame: 1 

TRHIKISILNLQNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKVAYLSRHEGKGDIFKD
PRLTYIRGDITEADKIHLEDRTFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPK
LVYISANSGYSAYIKSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHL
PFLGIVVQKVFPTKVVIVAEAIVTTLRKKPTQKILSIEELNNK

SEQ ID NO. 6622 

STRAIN JM9130013 frame: 1 

TRHIKISILNLQNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKVAYLSRHEGKGDIFKD
PRLTYIRGDITEADKIHLEDRTFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPK
LVYISANSGYSAYIKSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHL
PFLGIVVQKVFPTKVVIVAEAIVTTLRKKPTQKILSIEELNNK

SEQ ID NO. 6701 
STRAIN 090 

C AAT AAC AAC AT T T G AAAAT AAAAAAGT T T T AGT CCTTGGTT TAG C AC G A 
TCTGGAGAAGCCGCTGCACGTTTGTTAGCTAAGTTAGGAGCAATAGTGAC 
AGT T AAT GAT G G C AAAC CAT T T GAT G AAAAT C C AAC AG C AC AGT CT T T GT 
TGGAAGAGGGTATTAAAGTGGTTTGTGGTAGTCATCCTTTAGAATTGTTA 
GAT G AGG AT T T T T GT T AC AT GAT T AAAAAT C C AG G AAT AC C T TAT AAC AA 
T CCT AT GGT C AAAAAAG CAT T AG AAAAAC AAAT CCCTGTTTT G AC T G AAG 
TGGAATTAGCATACTTAGTTTCAGAATCTCAGCTAATAGGTATTACAGGC 
T CT AAC G G G AAAAC G AC AAC G AC AAC GAT GAT T G C AG AAGT C T T AAAT G C 
TGGAGGTCAGAGAGGTTTGTTAGCTGGGAATATCGGCTTTCCTGCTAGTG 
AAGT TGT T C AGGC T G C G GAT G AT AAAG AT AT T C TAG T TAT GG AAT TAT C A 

AGTTTTCAGCTAATGGGAGTTAAGGAATTTCGTCCTCATATTGCAGTAAT 
T AC T AAT T T AAT G C C AAC T C AT TT AGAT TAT CAT GGGTCTTTT GAAGAT T 
ATGTTGCTGCAAAATGGAATATCCAAAATCAAATGTCTTCATCTGATTTT 
TTGGTACTTAATTTTAATCAAGGTATTTCTAAAGAGTTAGcTAAAACTAC 
TAAAGCAACAATCGTTCCTTTCTCTACTACGGAAAAAGTTGATGGTGCTT 
AC GT AC AAG AC AAG C AAC T T T T CT AT AAAGGG G AG AAT AT T AT GT T AGT A 
GATGACATTGGTGTCCCAGGAAGCCATAACGTAGAGAATGCTCTAGCAAC 
TAT TGCGGTTGC T AAAC TAG C T GGT AT C AGT AAT C AAGT TAT T AGAG AAA 
CTTTAAGCAATTTTGGAGGTGTTAAACACCGCTTGCAATCACTCGGTAAG 
GT T CAT G G T AT T AGT T T C TAT AAC G AC AG C AAG T C AAC T AAT AT AT T GG C 
AACTCAAAAAGCATTATCTGGCTTTGATAATACTAAAGTTATCCTAATTG 
C AG GAG G T CT T GAT C G C G GT AAT G AGT T T GAT G AAT T GAT AC C AG AT AT C 
AC T GG AC T T AAAC AT AT GGT T GT T T T AG GG G AAT C G G CAT C T C GAG T AAA 
AC G T G CT G C AC AAAAAG C AGG AGT AAC T TAT AG C GAT G C T T T AG ATG T T A 
GAG AT G C GGT AC AT AAAG C T T AT GAG GT G G C AC AAC AG G G C GAT G T TAT C 
T T G C T AAGT C CT G C AAAT G C AT C AT GGG AC AT GT AT AAG AAT T T C G AAG T 
CCGTGGTGATGAATTCATTGATACtTTCGAAAGTCTTAGAGGAGAG 



SEQ ID NO. 6702 
STRAIN A909

CAATAACAACATTTGAAAAT AAAAAAGT TTT AGT CCT T GGT T TAGCACG A 
T C T GGAG AAG CT G C T G C AC GT T T GT T AG CT AAG T T AG GAG C AAT AG T G AC 
AG T T AAT GAT GG C AAAC CAT T T GAT G AAAAT C C AAC AG C AC AGT C T T T GT 
TGGAAGAGGGTATTAAAGTGGTTTGTGGTAGTCATCCTTTAGAATTGTTA 
GAT GAGGAT T TTTGTTACAT GAT TAAAAATCCAGGAAT ACCT T ATAACAA
T C C T AT G GT C AAAAAAGC AT T AGAAAAAC AAAT CCCTGTTTT G ACT G AAG 
T GGAATT AGCATACTTAGTT T CAGAAT CT CAGCT AATAGGT ATTACAGGC 
T C T AAC G G GAAAAC GAC AAC G AC AAC GAT GAT T G C AG AAGT CT T AAAT G C 
TGGAGGTCAGAGAGGTTTGTTAGCTGGGAATATCGGCTTTCCTGCTAGTG 
AAGTT GTT CAGGCT GCGAAT GAT AAAGAT ACT CTAGTTAT GGAATT AT CA 
AGT T T T C AG CT AAT G GG AGT T AAG GAAT TTCGTCCT CAT AT T G C AGT AAT 
T AC T AAT T T AAT GC C AAC T CAT T TAG AT TAT CAT GGGTCTTTT GAAGAT T 
ATGTTGCTGCAAAATGGAATATCCAAAATCAAATGTCTTCATCTGATTTT 
T T G GT AC T T AAT T T T AAT C AAGG T AT T T C T AAAG AGT T AG CT AAAAC T AC 
T AAAGCaACAAT CGT T C CTTT CT CT ACT ACGGAAAAAGT T GAT GGTGCT T 
AC GT AC AAGAC AAG C AAC T T T T C T AT AAAGGG G AGAAT AT T AT G T C AGT A 
GAT GAC AT TGGTGTCC C AGG AAG C C AT AAC GT AnAGAAT G C T C T AGC AAC 
TATTGCGGTTGCTAAACTGGCTGGTATCAGTAATCAAGTTATTAgAGAAA 
CTTTAAGCAATTTTGGAGGtGTTAAACACCGCTTGCAATCACTCGGTAAG 
GTTCATGGTATTAGTTTCTATAACGACAGCAAGTCAACTAATATATTGGC 
AACT CAAAAAG CAT TAT CTGGCTTT G AT AAT ACT AAAGT T AT C C T AAT T G 
C AGG AG GT CT T GAT C GCG GT AAT GAGT T T GAT GAAT T G AT AC C AGAT AT C 
ACTGGACTTAAACATATGGTTGTTTTAGGGGAATCGGCATCTCGAGTAAA 
ACGT G C T G C AC AAAAAGC AG GAG T AAC T TAT AG C GAT G CT T T AG AT GT T A 
GAGATGCGGTACATAAAGCTTATGAGGTGGCACAACAGGGCGATGTTATC 
T T G C T AAGT C C T GC AAAT G CAT CAT G GG AC AT GT AT AAG AAT T T C G AAGT 
C CGT G GT GAT GAAT T CAT T GAT ACT T T C G AAAGT C T TAG AGG AG AG 

SEQ ID NO. 6703 
STRAIN H36B

GGACGAGTAATGAAAACAATAACAACATTTGAAAAT 

AAAAAAGTTTTAGTCCTTGGTTTAGCACGATCTGGAGAAGCTGCTGCACG 
T T T GT T AGC T AAGT T AGG AGC AAT AGT GAC AGT T AAT GAT G GC AAAC CAT 
TTGATGAAAATCCAACAGCACAGTCTTTGTTGGAAGAGGGTATTAAAGTG 
GTTTGTGGTAGTCATCCTTTAGAATTGTTAGATGAGGATTTTTGTTACAT 
GAT T AAAAAT C C AGG AAT AC C T TAT AAC AAT CC T AT G GT C AAAAAAG C AT 
T AGAAAAAC AAATCCCTGTTTTGACTGAAGTGGAATT AGCATACTTAGTT 
T CAGAAT CT C AG CT AAT AGGT AT T AC AG G C T C T AAC G G GAAAAC GAC AAC 
GACAACGATGATTGCAGAAGTCTTAAATGCTGGAGGTCAGAGAGGTTTGT 
TAGCTGGGAATATCGGCTTTCCTGCTAGTGAAGTTGTTCAGGCTGCGAAT 
GAT AAAGAT ACT CTAGTTATGGAATT AT CAAGTTTTCAGCTAATGGGAGT 
TAAGGAATTTCGTCCTCATATTGCAGTAATTACTAATTTAATGCCAACTC 
ATTTAGATTATCATGGGTCTTTTGAAGATTATGTTGCTGCAAAATGGAAT 
ATCCAAAATCAAATGTCTTCATCTGATTTTTTGGTACTTAATTTTAATCA 
AGGT ATTTCT AAAG AGTTAGCTAAAACT ACT AAAG C AAC AAT CGTTCCTT 
T C T C T AC T AC G G AAAAAG T T GAT GGTGCT T AC GT AC AAG AC AAG C AAC T T 
TTCTATAAAGGGGAGAATATTATGTCAGTAGATGACATTGGTGTCCCAGG 
AAGC C AT AAC GT AG AG AAT G CT C T AG C AACT AT TGCGGTTGC T AAAC T G G 
CT G GT AT C AG T AAT C AAG T TAT TAG AG AAAC T T T AAG C AAT T T T G G AGG T 
GTTAAACACCGCTTGCAATCACTCGGTAAGGTTCATGGTATTAGTTTCTA 
TAACGACAGCAAG 

SEQ ID NO. 6704 
STRAIN 18RS21 

GG AC GAGT AAT GAAAAC AAT AAC AAC AT T T G 

AAAATAAAAAAGTTTTAGTCCTTGGTTTAGCACGATCTGGAGAAGCTGCT 
GCACGTTTGTTAGCTAAGTTAGGAGCAATAGTGACAGTTAATGATGGCAA 
AC CAT T T GAT G AAAAT C C AAC AG C AC AG T C T T T GT T GGAAG AG G GT AT T A 
AAGTGGTTTGTGGTAGTCATCCTTTAGAATTGTTAGATGAGGATTTTTGT 
TAC AT GAT T AAAAAT CCAGG AAT ACCTT AT AAC AAT CCTATGGTCAAAAA 
AGCATTAGAAAAACAAATCCCTGTTTTGACTGAAGTGGAATTAGCATACT 
TAGTTTCAGAATCTCAGCTAATAGGTATTACAGGCTCTAACGGGAAAACG 
AC AAC GAC AAC GAT GAT T G C AG AAGT C T T AAAT G C T GG AG G T C AG AG AG G 
TTTGTTAGCTGGGAATATCGGCTTTCCTGCTAGTGAAGTTGTTCAGGCTG 
C GAAT GAT AAAGAT AC T C T AG T T AT GG AAT TAT C AAGT T T T C AG C T AAT G 
GGAGTTAAGGAATTTCGTCCTCATATTGCAGTAATTACTAATTTAATGCC 
AACTCATTTAGATTATCATGGGTCTTTTGAAGATTATGTTGCTGCAAAAT 
GGAATATCCAAAATCAAATGTCTTCATCTGATTTTTTGGTACTTAATTTT 
AATCAAGGTATTTCTAAAGAGTTAGCTAAAACTACTAAAGCAACAATCGT
T CC T T T CT C TACT AC G G AAAAAG T T GAT GGTG CT T ACGT AC AAG AC AAG C 
AACTTTTCTATAAAGGGGAGAATATTATGTCAGTAGATGACATTGGTGTC 
CCAGGAAGCCATAACGTAGAGAATGCTCTAGCAACTATTGCGGTTGCTAA 
ACT GG C T GGT AT C AGT AAT C AAGT TAT T AG AG AAACT T T AAG C AAT T T T G 
G AGGT GT T AAAC AC C G CT T G C AAT C AC T C GGT AAGGT T C AT GGT AT T AGT 
T T CT AT AAC GAC AG C AAGT C AACT AAT AT AT T GG C AAC T C AAAAAGC AT T 
ATCTGGCTTTGATAATACTAAAGTTATCCTAATTGCAGGAGGTCTTGATC 
GCGGTAATGAGTTTGATGAATTGATACCAGATATCACTGGACTTAAACAT 
ATGGTTGTTTTAGGGGAATCGGCATCTCGAGTAAAACGTGCTGCACAAAA 
AG C AG G AGT AACT TAT AG C GAT G CT T TAG AT GT T AG AGAT GC G G T AC AT A 
AAGCTTATGAGGTGGCACAACAGGGCGATGTTATCTTGCTAAGTCCTGCA 
AAT G CAT CAT GGG AC AT GT AT AAG AAT TT C GAAGT C CG T GGT GAT G AAT T 
CATTGATACTTTCGAAAGTCTTAGAGGAGAG 

SEQ ID NO. 6705 
STRAIN M732 

GGACGAGT AAT GAAAACAAT AAC AAC AT T T GAAA 

ATAAAAAAGTTTTAGTCCTTGGTTTAGCACGATCTGGAGAAGCCGCTGCA 
C GT T T GT T AG C T AAG T T AG GAG C AAT AGT GAC AGT T AAT GAT G G C AAAC C 
AT T T GAT G AAAAT CC AAC AGC AC AGT C TT T GT T GG AAG AG GGT AT T AAAG 
TGGTTTGT GGT AGT CAT C C T T TAG AAT TGT TAG AT G AGGAT T T T T GT T AC 
AT GAT T AAAAAT C C AG G AAT AC C T TAT AAC AAT C CT AT GGT C AAAAAAGC 
ATT AGAAAAACAAATCCCTGTTTTGACT GAAGT GGAATT AGC AT ACT TAG 
T T T C AG AAT CT C AG CT AAT AGG T AT T AC AG G C T CT AACGGG AAAAC GAC A 
AC G AC AACGAT GAT T G C AG AAGT C T T AAAT G C T GG AGGT C AGAG AG GT T T 
GTTAGCTGGGAATATCGGCTTTCCTGCTAGTGAAGTTGTTCAGGCTGCGG 
a T GAT AAAG AT AT T CT AG T T AT G G AAT TAT C AAGT T T T C AG C T AAT G G GA 
GTTAAGGAATTTCGTCCTCATATTGCAGTAATTACTAATTTAATGCCAAC 
TCAtTTAGATTATCATGGGTCTTTTGAAGATTATGtTGCTGCAAAATGGA 
ATATCCAAAATCAAATGTCTTCATCTGATTTTTTGGTACTTAATTTTAAT 
C AAGGT ATT TCT AAAG AGTT AGC TAAAACT ACT AAAGC AAC AaTCGTTCC 
TTTCTCTACTACGGAAAAAGTTGATGGTGCTTACGTACAAGACAAGCAAC 
T T T T CT AT AAAG GG GAG AAT AT TAT GT C AGT AG AT GAC AT TGGTGTCC C A 
GGAAGC CAT AACGTAGAGAATGCTCT AGC AACT ATT GCGGTTGCT AAACT 
AGCTGGTATCAGTAATCAAGTTATTAGAGAAACTTTAAGCAATTTTGGAG 
GTGTTAAACACCGCTTGCAATCACTCGGTAAGGTTCATGGTATTAGTTTC 
T AT AACGACAGCAAGTCAACTAAT AT ATTGGCAACTC AAAAAGC ATT AT C 
TGGCTTTGATAATACTAAAGTTATCCTAATTGCAGGAGGTCTTGATCGCG 
GTAATGAGTTTGATGAATTGATACCAGATATCACTGGACTTAAACATATG 
GTT GTTT T AGGGGAATCGGC AT CTCG AGT AAAACGTGCTGC AC AAAAAGC 
AGGAGTAACT T AT AGCGAT GCTT T AGATGT TAGAGAT GCGGT ACAT AAAG 
CTTATGAGGTGGCACAACAGGGCGATGTTATCTTGCTAAGTCCTGCAAAT 
G CAT C AT GG GAC AT GT AT AAG AAT T T C GAAGT C C GT G GT GAT G AAT T CAT 
T GAT AC T T T C G AAAGT C T T AG AGG AG AG 

SEQ ID NO. 6706 
STRAIN COH1 

GG AC G AGT AAT GAAAACAAT AAC AAC ATTTG A 

AAATAAAAAAGTTTTAGTCCTTGGTTTAGCACGATCTGGAGAAGCCGCTG 
C AC G T T T GT T AG CT AAG T T AGG AG C AAT AGT GAC AGT T AAT G AT GG C AAA 
CC AT T T GAT G AAAAT CC AAC AG C AC AGT CTTTGTTG G AAG AG G G T AT T AA 
AGTGGTTTGT G GT AGT CAT C C T T TAG AAT T G T TAG AT GAGG AT T T T T G T T 
AC AT GAT T AAAAAT C C AG G AAT AC C T TAT AAC AAT CC TAT GGT C AAAAAA 
GC ATT AGAAAAAC AAAT CCCTGTTTTGACTG AAGT GGAATT AGC AT ACTT 
AGT TT C AG AAT CT C AGCT AAT AGG T AT T AC AG GC T CT AAC G G G AAAAC G A 
C AAC GAC AAC GAT GAT T G C AG AAG T CT T AAAT G C T GG AGGT C AG AG AG G T 
TTGTTAGCTGGGAATATCGGCTTTCCTGCTAGTGAAGTTGTTCAGGCTGC 
G GaT G AT AAAG AT ATT CT AGT TAT GGAATT AT C AAGT TTTC AGCT AATGG 
GAGTTAAGGAATTTCGTCCTCATATTGCAGTAATTACTAATTTAATGCCA 
ACTCATTTAGATTATCATGGGTCTTTTGAAGATTATGTTGCTGCAAAATG 
GAATATCCAAAATCAAATGTCTTCATCTGATTTTTTGGTACTTAATTTTA 
AT C AAGGT AT T T CT AAAG AGT T AG C T AAAAC TACT AAAG CAaC AAT C GTT 
C CT T T C T CT ACT AC G G AAAAAGT T GAT GGTG C TT AC GT AC AAGAC AAG C A 
ACT T T T CT AT AAAGGGG AG AAT AT TAT GT C AGT AG AT G AC AT T G GT GT C C
C AG GAAG C C AT AACGT AG AG AAT G C T CT AG C AAC TAT T GC GGT T G CT AAA 
C T AG CT GGT AT C AGT AAT C AAGT T AT T AGAG AAAC T T T AAG C AAT T T T G G 
AG G T GT T AAAC ACC G CT T G C AAT C ACT C G G T AAG GT T CAT GGT AT T AGT T 
TCTATAACGACAGCAAGTCAACTAATATATTGGCAACTCAAAAAGCATTA 
TCTGGCTTTGATAATACTAAAGTTATCCTAATTGCAGGAGGTCTTGATCG 
CGGT AAT GAGTTTGATGAAT TGAT ACCAGAT AT C ACT GGACTT AAAC ATA 
TGGTTGTTTTAGGGGAATCGGCATCTCGAGTAAAACGTGCTGCACAAAAA 
G C AGG AGT AAC T TAT AG C GAT G C T T TAG AT GT T AG AG AT GCG GT AC AT AA 
AGCTTATGAGGTGGCACAACAGGGCGATGTTATCTTGCTAAGTCCTGCAA 
AT GCAT CATGGGAC ATGT AT AAGAATTT CGAAGT CCGT GGTGAT GAAT T C 
ATTGATACTTTCGAAA 

SEQ ID NO. 6707 
STRAIN M781 

GGACGAGT AATGAAAACAAT AAC AACAT T 

TGAAAATAAAAAAGTTTTAGTCCTTGGTTTAGCACGATCTGGAGAAGCCG 
CTGCACGTTTGTTAGCTAAGTTAGGAGCAATAGTGACAGTTAATGATGGC 
AAACCATTTGATGAAAATCCAACAGCACAGTCTTTGTTGGAAGAGGGTAT 
TAAAGTGGTTTGTGGTAGTCATCCTTTAGAATTGTTAGATGAGGATTTTT 
GTTACATGATTAAAAATCCAGGAATACCTTATAACAATCCTATGGTCAAA 
AAAGC AT T AGAAAAAC AAAT CCCTGTTTT G AC T G AAGT GG AAT T AGC AT A 
CTTAGTTTCAGAATCTCAGCTAATAGGTATTACAGGCTCTAACGGGAAAA 
CGACAACGACAACGATGATTGCAGAAGTCTTAAATGCTGGAGGTCAGAGA 
GGTTTGTTAGCTGGGAATATCGGCTTTCCTGCTAGTGAAGTTGTTCAGGC 
T G CG G AT GAT AAAGAT AT T C T AGT T AT GGAAT TAT C AAGT T T T C AG C T AA 
TGGGAGTTAAGGAATTTCGTCCTCATATTGCAGTAATTACTAATTTAATG 
CCAACTCATTTAGATTATCATGGGTCTTTTGAAGATTATGTTGCTGCAAA 
ATGGAAT AT C CAAAAT CAAAT GT CT T C AT CT GAT T TT T TGGT ACT T AATT 
T T AAT C AAG GT AT T T C T AAAG AG T T AG CT AAAACT AC T AAAG C Aa C AAT C 
GTTCCTTTCTCTACTACGGAAAAAGTTGATGGTGCTTACGTACAAGACAA 
G C AAC T T T T CT AT AAAG GG GAG AAT AT T AT GT C AGT AG AT G AC AT T G G T G 
T C C C AG GAAG C CAT AAC GT AG AG AAT G C T CT AG C AACT AT TGCGGTTGCT 
AAACT AG C T GGT AT C AG T AAT C AAGT T AT T AG AGAAAC T T T AAG C AAT T T 
TGGAGGTGTTAAACACCGCTTGCAATCACTCGGTAAGGTTCATGGTATTA 
G T T T C TAT AAC G AC AG C AAGT C AACT AAT AT AT T G G C AAC T C AAAAAG C A 
T TAT CTGGCTTT G AT AAT AC T AAAGT TAT C C T AAT T G C AGGAGGT C T T G A 
TCGCGGTAATGAGTTTGATGAATTGATACCAGATATCACTGGACTTAAAC 
ATATGGTTGTTTTAgGGGAATCGGCATCTCGAGTAAAACGTGCTGCACAA 
AAAG CAGG AGT a AC T TAT AG C GAT G C T T TAG AT GT TAG AG AT G C GG T AC A 
TAAAGCTTATGAGGTGGCACAACAGGGCGATGTTATCTTGCTAAGTCCTG 
CAAATGCATCATGGGACATGT AT AAGAATTT CGAAGT CCGT GGT GAT GAA 
T T CAT T GAT ACT T T C G AAAG T C T T AGAGG AG AG 

SEQ ID NO. 6708 
STRAIN CJB110 

GGACGAGT AATGAAAACAAT AAC AAC AT T T G A 

AAATAAAAAAGTTTTAGTCCTTGGTTTAGCACGATCTGGAGAAGCCGCTG 
C AC GT T T GT TAG C T AAGT TAG GAG C AAT AGT GAC AGT T AAT GAT GG C AAA 
C C AT T T GAT G AAAAT C C AAC AG C AC AGT C T T T GT T GG AAG AG G G T AT T AA 
AGTGGTTTGTGGTAGTCATCCTTTAGAATTGTTAGATGAGGATTTTTGTT 
AC AT G AT T AAAAAT C C AG GAAT AC C T TAT AAC AAT C C TAT GGT C AAAAAA 
G C AT TAG AAAAAC AAAT CCCTGTTTT G ACT GAAG T GGAAT TAG CAT AC T T 
AGTTTCAGAATCTCAGCTAATAGGTATTACAGGCTCTAACGGGAAAACGA 
C AAC GAC AAC GAT GAT T G C AG AAGT C T T AAAT G CT GGAGGT C AG AG AG GT 
TTGTTAGCTGGGAATATCGGCTTTCCTGCTAGTGAAGTTGTTCAGGCTGC 
GGAT GAT AAAGATATT CT AGTT AT GGAATT AT CAAGT TTT CAG CT AATGG 
GAGTTAAGGAATTTCGTCCTCATATTGCAGTAATTACTAATTTAATGCCA 
ACTCATTTAGATTATCATGGGTCTTTTGAAGAATATGTTGCTGCAAAATG 
GAATATCCAAAATCAAATGTCTTCATCTGATTTTTTGGTACTTAATTTTA 
ATCAAGGTATTTCTAAAGAGTTAGCTAAAACTACTAAAGCAACAATCGTT 
CCTTTCTCTACTACGGAAAAAGTTGATGGTGCTTACGTACAAGACAAGCA 
ACT T T T C T AT AAAG G GG AG AAT AT TAT G T T AGT AG AT GAC AT TGGTGTCC 
CAGG AAG C CAT AAC GT AG AG AAT G C T C T AGC AAC TAT TGCGGTTG C T AAA 
CTAGCTGGT AT CAGT AAT C AAGT TAT TAGAGAAACTTT AAGCAATTT T GG
AGGTGTTAAACACCGCTTGCAATCACTCGGTAAGGTTCATGGTATTAGTT 
T CT AT AAT G AC AG C AAGT C AACT AAT AT AT T GG C AAC T C AAAAAG CAT T A 
TCTGGCTTTGATAATACTAAAGTTATCCTAATTGCAGGAGGTCTTGATCG 
CGGT AATGAGT T T GAT GAAT T GAT AC C AGAT AT C AC T GG AC T T AAAC AT A 
TGGTTGTTTTAGGGGAATCGGCATCTCGAGTAAAACGTGCTGCACAAAAA 
G C AG G AGT AACT T AT AG C GAT GC T T T AGAT GT T AG AG AT GCGGT AC AT AA 
AGCTTATGAGGTGGCACAACAGGGCGATGTTATCTTGCTAAGTCCTGCAA 
ATGCAT CAT GGGACAT GTATAAGAATT TCGAAGT CCGTGGTGAT GAATT C 
ATTGATACTTTCGAAAGTCTTAGAGGAGAG 

SEQ ID NO. 6709 
STRAIN 1169NT 

C AAT AAC AAC AT T T G AAAAT AAAAAAGT T T T AGT C C T T G GT T T AG C AC G A 
TCTGGAGAAGCCGCTGCACGTTTGTTAGCTAAGTTAGGAGCAATAGTGAC 
AGT T AAT GAT G G C AAAC CAT T T GAT G AAAAT C C AAC AG C AC AGT CT T T GT 
TGGAAGAGGGTATTAAAGTGGTTTGTGGTAGTCATCCTTTAGAATTGTTA 
GAT GAG GAT T T T T G T T AC AT GAT T AAAAAT C C AG GAAT AC CT TAT AAC AA 
TCCTATGGTCAAAAAAGCATTAGAAAAACAAATCCCTGTTTTGACTGAAG 
TGGAATTAGCATACTTAGTTTCAGAATCTCAGCTAATAGGTATTACAGGC 
T CT AAC GGG AAAACGAC AAC GAC AACGAT GAT T G C AG AAGT CT T GAAT G C 
TGGAGGTCAGAGAGGTTTGTTAGCTGGGAATATCGGCTTTCCTGCTAGTG 
AAGTTGTTCAGGCTGCGGATGATAAAGATACTCTAGTTATGGAATTATCA 
AGTTTTCAGCTAATGGGAGTTAAGGAATTTCGTCCTCATATTGCAGTAAT 
TACTAATTTAATGCCAACTCATTTAGATTATCATGGGTCTTTTGAAGAtT 
AT G t TGCTGCAAAATGGAAT AT CCAAAAT C AAATGT CT T CAT CT GAT TT T 
T T GGT AC T T AAT T T T AAT C AAGGT AT T T C T AAAGAGT T AG c T AAAAC T AC 
TAAAGCAACAATCGTTCCTTTCTCTACTACGGAAAAAGTTGATGGTGCTT 
ACGTACAAGACAAGCAACTTTTCTATAAAGGGGAGAATATTATGTCAGTA 
GAC GAC AT TGGTGTCC C AG G AAG C C AT AACGT AGAGAAT G CT CT AG C AAC 
TAT TGCGGTTGC T AAAC T AG CT G GT AT CAGT AAT C AAGT TAT TAG AG AAA 
CTTTAAGCAATTTTGGAGGTGTTAAACACCGCTTGCAATCACTCGGTAAG 
GTTCATGGTATTAGTTTCTATAACGACAGTAAGTCAACTAATATATTGGC 
AACTCAAAAAGCATTATCTGGCTTTGATAATACTAAAGTTATCCTAATTG 
C AGG AGGT CT T GAT C GC GG T AAT G AGT T T GAT GAAT T GAT AC C AG AT AT C 
ACTGGACTTAAGCATATGGTTGTTTTAGGGGAATCGGCATCTCGAGTAAA 
ACGTGCTGCACAAAAAGCAGGAGTAACTTATAGCAATGCTTTAgATGTTA 
GAgATGCgGTACATAAAGCTTATGAGGTGGCACAACAGGGCGATGTTATC 
TT GTTmAGT c CTGCGAATGCAT CAT GGGACAT GT AT AAGAAT T T CGAAGT 
CCGTGGTGATGAATTCATTGATACTTTCG 

SEQ ID NO. 6710 

STRAIN JM9130013 

GGACGAGTAATGAAAACAATAACAACA 

TTTGAAAATAAAAAAGTTTTAGTCCTTGGTTTAGCACGATCTGGAGAAGC 
T G C T G C AC GT T T GT T AG CT AAG T TAG GAG C AAT AGT GAC AGT T AAT GAT G 
GC AAAC CAT T T GAT G AAAAT C C AAC AGC AC AGT C T T T G T T GG AAGAGGG T 
AT T AAAGT G GT T T GT G GT AGT CAT C CT T TAG AAT T G t T AG AT GAG GAT T T 
T T G T T AC AT GAT T a AAAAT C C AGGAAT AC C T TAT AAC AAT C C T AT GGT C A 
AAAAAG CAT TAG AAAAACAAAT CCCTGTTTT G ACT G AAGT GGAAT T AG C A 
TACTT AGT TTCAGAATCTCAGCT AAT AGGT ATT ACAGGCTCTAACGGGAA 
AAC GAC AAC GAC AAC GAT GAT T G C AG AAG T C T T AAAT G C T GG AG G T C AG A 
GAGGTTTGTTAGCTGGGAATATCGGCTTTCCTGCTAGTGAAGTTGTTCAG 
G C T G CG AAT GAT AAAG AT AC T C T AGT TAT G GAAT TAT C AAGT T T T C AG C T 
AATGGGAGTTAAGGAATTTCGTCCTCATATTGCAGTAATTACTAATTTAA 
TGCCAACTCATTTAGATTATCATGGGTCTTTTGAAGATTATGTTGCTGCA 
AAATGGAAT AT CCAAAAT CAAAT GT CTT CAT CT GAT TT TT TGGTACT TAA 
TTTTAATCAAGGTATTTCTAAAGAGTTAGCTAAAACTACTAAAGCaACAA 
TCGTTCCTTTCTCTACTACGGAAAAAGTTGATGGTGCTTACGTACAAGAC 
AAGCAACTTTTCTATAAAGGGGAGAATATTATGTCAGTAGATGACATTGG 
TGTCCCAGGAAGC CAT AACGT AGAGAAT GCTCT AGC AACT AT TGCGGTTG 
CTAAACTGGCTGGTATCAGTAATCAAGTTATTAGAGAAACTTTAAGCAAT 
T TTGGAGGTGTT AAAC ACCGCTTGCAATC ACT CGGT AAGGT T CAT GGT AT 
TAG t T T CT AT AACG AC AGC AAGT C AACT AAT AT AT TGG C AACT C AAAAAG 
CATTATCTGGCTTTGATAATACTAAAGTTATCCTAATTGCAGGAGGTCTT
GAT CGCAGTAAT GAGT T T GATGAAT TGAT ACCAGAT AT CACTGGACTTAA 
ACATATGGTTGTTTTAGGGGAATCGGCATCTCGAGTAAAACGTGCTGCAC 
AAAAAG C AGG AGT AACT T AT AG C GAT G C T T T AGAT GT T AGAGAT G CGGT A 
CATAAAGCTTATGAGGTGGCACAACAGGGCGATGTTATCTTGCTAAGTCC 
T GC AAAT G CAT C AT GGG AC AT GT AT AAG AAT T T C G AAG T C C GT G GT G AT G 
AAT T CAT T GAT AC t T T C G AAAGT CT T AG AGGAG AG 

SEQ ID NO. 6710 

STRAIN 2603 

ggacgagtaatgaaaacaataacaacatttgaaaataaaaaagttttagt 
ccttggtttagcacgatctggagaagctgctgcacgtttgttagctaagt 
taggagcaatagtgacagttaatgatggcaaaccatttgatgaaaatcca 
acagcacagtctttgttggaagagggtattaaagtggtttgtggtagtca 
tcctttagaattgttagatgaggatttttgttacatgattaaaaatccag 
gaataccttataacaatcctatggtcaaaaaagcattagaaaaacaaatc 
cctgttttgactgaagtggaattagcatacttagtttcagaatctcagct
aataggtattacaggctctaacgggaaaacgacaacgacaacgatgattg 
cagaagtcttaaatgctggaggtcagagaggtttgttagctgggaatatc 
ggctttcctgctagtgaagttgttcaggctgcgaatgataaagatactct 
agttatggaattatcaagttttcagctaatgggagttaaggaatttcgtc 
ctcatattgcagtaattactaatttaatgccaactcatttagattatcat 
gggtcttttgaagattatgttgctgcaaaatggaatatccaaaatcaaat 
gtcttcatctgattttttggtacttaattttaatcaaggtatttctaaag 
agttagctaaaactactaaagcaacaatcgttcctttctctactacggaa 
aaagttgatggtgcttacgtacaagacaagcaacttttctataaagggga 
gaatattatgtcagtagatgacattggtgtcccaggaagccataacgtag 
agaatgctctagcaactattgcggttgctaaactggctggtatcagtaat 
caagttattagagaaactttaagcaattttggaggtgttaaacaccgctt 
gcaatcactcggtaaggttcatggtattagtttctataacgacagcaagt 
caactaatatattggcaactcaaaaagcattatctggctttgataatact 
aaagttatcctaattgcaggaggtcttgatcgcggtaatgagtttgatga 
attgataccagatatcactggacttaaacatatggttgttttaggggaat 
cggcatctcgagtaaaacgtgctgcacaaaaagcaggagtaacttatagc 
gatgctttagatgttagagatgcggtacataaagcttatgaggtggcaca 
acagggcgatgttatcttgctaagtcctgcaaatgcatcatgggacatgt 
ataagaatttcgaagtccgtggtgatgaattcattgatactttcgaaagt 
cttagaggagag 

SEQ ID NO. 6711 

STRAIN 090 frame: 3 

ITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGIKVVCGS
HPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGITGSNGK
TTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAADDKDILVMELSSFQLMGVKEFRPHI
AVITNLMPTHLDYHGSFEDYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTKATIVPF 
STTEKVDGAYVQDKQLFYKGENIMLVDDIGVPGSHNVENALATIAVAKLAGISNQVIRET 
LSNFGGVKHRLQSLGKVHGISFYNDSKSTNILATQKALSGFDNTKVILIAGGLDRGNEFD 
ELIPDITGLKHMVVLGESASRVKRAAQKAGVTYSDALDVRDAVHKAYEVAQQGDVILLSP 
ANASWDMYKNFEVRGDEFIDTFESLRGE

SEQ ID NO. 6712 

STRAIN A909 frame: 3

ITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGIKVVCGS
HPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGITGSNGK
TTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAANDKDTLVMELSSFQLMGVKEFRPHI
AVITNLMPTHLDYHGSFEDYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTKATIVPF 
STTEKVDGAYVQDKQLFYKGENIMSVDDIGVPGSHNVXNALATIAVAKLAGISNQVIRET 
LSNFGGVKHRLQSLGKVHGISFYNDSKSTNILATQKALSGFDNTKVILIAGGLDRGNEFD 
ELIPDITGLKHMVVLGESASRVKRAAQKAGVTYSDALDVRDAVHKAYEVAQQGDVILLSP
ANASWDMYKNFEVRGDEFIDTFESLRGE

SEQ ID NO. 6713 

STRAIN H36B frame: 1

GRVMKTITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGI
KVVCGSHPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGI
TGSNGKTTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAANDKDTLVMELSSFQLMGVK
EFRPHIAVITNLMPTHLDYHGSFEDYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTK 
ATIVPFSTTEKVDGAYVQDKQLFYKGENIMSVDDIGVPGSHNVENALATIAVAKLAGISN 
QVIRETLSNFGGVKHRLQSLGKVHGISFYNDSK 

SEQ ID NO. 6714 

STRAIN 18RS21 frame: 1 

GRVMKTITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGI 
KVVCGSHPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGI
TGSNGKTTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAANDKDTLVMELSSFQLMGVK
EFRPHIAVITNLMPTHLDYHGSFEDYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTK
ATIVPFSTTEKVDGAYVQDKQLFYKGENIMSVDDIGVPGSHNVENALATIAVAKLAGISN 
QVIRETLSNFGGVKHRLQSLGKVHGISFYNDSKSTNILATQKALSGFDNTKVILIAGGLD 
RGNEFDELIPDITGLKHMVVLGESASRVKRAAQKAGVTYSDALDVRDAVHKAYEVAQQGD
VILLSPANASWDMYKNFEVRGDEFIDTFESLRGE

SEQ ID NO. 6715 

STRAIN M732 frame: 1 

GRVMKTITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGI 
KVVCGSHPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGI
TGSNGKTTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAADDKDILVMELSSFQLMGVK
EFRPHIAVITNLMPTHLDYHGSFEDYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTK
ATIVPFSTTEKVDGAYVQDKQLFYKGENIMSVDDIGVPGSHNVENALATIAVAKLAGISN 
QVIRETLSNFGGVKHRLQSLGKVHGISFYNDSKSTNILATQKALSGFDNTKVILIAGGLD 
RGNEFDELIPDITGLKHMVVLGESASRVKRAAQKAGVTYSDALDVRDAVHKAYEVAQQGD
VILLSPANASWDMYKNFEVRGDEFIDTFESLRGE

SEQ ID NO. 6716 

STRAIN COH1 frame: 1 

GRVMKTITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGI 
KVVCGSHPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGI
TGSNGKTTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAADDKDILVMELSSFQLMGVK
EFRPHIAVITNLMPTHLDYHGSFEDYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTK
ATIVPFSTTEKVDGAYVQDKQLFYKGENIMSVDDIGVPGSHNVENALATIAVAKLAGISN
QVIRETLSNFGGVKHRLQSLGKVHGISFYNDSKSTNILATQKALSGFDNTKVILIAGGLD
RGNEFDELIPDITGLKHMVVLGESASRVKRAAQKAGVTYSDALDVRDAVHKAYEVAQQGD
VILLSPANASWDMYKNFEVRGDEFIDTFE

SEQ ID NO. 6717 

STRAIN M781 frame: 1

GRVMKTITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGI 
KVVCGSHPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGI
TGSNGKTTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAADDKDILVMELSSFQLMGVK
EFRPHIAVITNLMPTHLDYHGSFEDYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTK
ATIVPFSTTEKVDGAYVQDKQLFYKGENIMSVDDIGVPGSHNVENALATIAVAKLAGISN 
QVIRETLSNFGGVKHRLQSLGKVHGISFYNDSKSTNILATQKALSGFDNTKVILIAGGLD 
RGNEFDELIPDITGLKHMVVLGESASRVKRAAQKAGVTYSDALDVRDAVHKAYEVAQQGD
VILLSPANASWDMYKNFEVRGDEFIDTFESLRGE

SEQ ID NO. 6718 

STRAIN CJB110 frame: 1 

GRVMKTITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGI 
KVVCGSHPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGI
TGSNGKTTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAADDKDILVMELSSFQLMGVK
EFRPHIAVITNLMPTHLDYHGSFEEYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTK
ATIVPFSTTEKVDGAYVQDKQLFYKGENIMLVDDIGVPGSHNVENALATIAVAKLAGISN
QVIRETLSNFGGVKHRLQSLGKVHGISFYNDSKSTNILATQKALSGFDNTKVILIAGGLD
RGNEFDELIPDITGLKHMVVLGESASRVKRAAQKAGVTYSDALDVRDAVHKAYEVAQQGD
VILLSPANASWDMYKNFEVRGDEFIDTFESLRGE

SEQ ID NO. 6719 

STRAIN 1169NT frame: 3 

ITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGIKVVCGS
HPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGITGSNGK
TTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAADDKDTLVMELSSFQLMGVKEFRPHI
AVITNLMPTHLDYHGSFEDYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTKATIVPF
STTEKVDGAYVQDKQLFYKGENIMSVDDIGVPGSHNVENALATIAVAKLAGISNQVIRET
LSNFGGVKHRLQSLGKVHGISFYNDSKSTNILATQKALSGFDNTKVILIAGGLDRGNEFD
ELIPDITGLKHMVVLGESASRVKRAAQKAGVTYSNALDVRDAVHKAYEVAQQGDVILXSP
ANASWDMYKNFEVRGDEFIDTF 

SEQ ID NO. 6720 

STRAIN JM9130013 frame: 1 

GRVMKTITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGI
KVVCGSHPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGI
TGSNGKTTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAANDKDTLVMELSSFQLMGVK
EFRPHIAVITNLMPTHLDYHGSFEDYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTK
ATIVPFSTTEKVDGAYVQDKQLFYKGENIMSVDDIGVPGSHNVENALATIAVAKLAGISN
QVIRETLSNFGGVKHRLQSLGKVHGISFYNDSKSTNILATQKALSGFDNTKVILIAGGLD
RSNEFDELIPDITGLKHMVVLGESASRVKRAAQKAGVTYSDALDVRDAVHKAYEVAQQGD
VILLSPANASWDMYKNFEVRGDEFIDTFESLRGE

SEQ ID NO. 6721 

STRAIN 2603 frame: 1

GRVMKTITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGI 
KVVCGSHPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGI
TGSNGKTTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAANDKDTLVMELSSFQLMGVK
EFRPHIAVITNLMPTHLDYHGSFEDYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTK
ATIVPFSTTEKVDGAYVQDKQLFYKGENIMSVDDIGVPGSHNVENALATIAVAKLAGISN
QVIRETLSNFGGVKHRLQSLGKVHGISFYNDSKSTNILATQKALSGFDNTKVILIAGGLD
RGNEFDELIPDITGLKHMVVLGESASRVKRAAQKAGVTYSDALDVRDAVHKAYEVAQQGD
VILLSPANASWDMYKNFEVRGDEFIDTFESLRGE

SEQ ID NO. 6801 
STRAIN 2603 

AT GGC T AAAG AG AGG GT AG AT GT T C T T GC C T AT AAAC AG GG AC T T T T T GAT AC AC GAG AG 
CAAGCGAAACGT GGTGT T AT GGC AGGAAT GGTGAT T AAC GT T AT CAAT GGAGAACGT TAT 
GAT AAAC C AGGT G AAAAG G T T G C AG AC GAT AC T G AAT T AAAAC T AAAAGGT G AAAAAC T A 
AAAT AT GT T AGT AGAGGT GG AT T G AAAT T AG AAAAAG C T T T AC AAGT T T T T G AAAT T T C A 
GTTGCAGATAAGCTAACTATAGATATTGGCGCCTCTACGGGTGGTTTTACTGATGTTATG 
CT AC AAT C AG G AGC G CGT T T AG T T T AC G C AGT AG AT GT AGG AAC AAAT CAAT T AGT T T G G 
AAGTTACGTCAGGATCATCGTGTTCGTTCTATGGAACAATATAATTTTAGGTATGCCCAA 
AAAGAAGATTTCAAGGAGGGACTGCCTGAATTTGCATCGATAGATGTCTCATTTATCTCT 
CTTAATTTGATTTTACCAGCTCTAAAAGAAATTTTAGTGGATGGTGGACAAGTAGTGGCA 
T T AAT T AAAC C AC AAT T T GAAG C AG GT CGT GAG C AAAT T G G T AAAAAT G GT AT T G T C AAA 
GACAAGTTGGTTCATGAAAAGGTTTTGACAACAGTGACCAATTTCACGAAAGATTATGGA 
TATACGGTTAAACATCTTGATTTTTCGCCCATTCAAGGTGGACATGGAAATATTGAGTTT 
T T AAT GC AT T T G C AAAAG T GT C AAG AT C C AC AAAAT CTTGTGCTT G AC C AAAT AC AAG AT 
GT TAT AG AAAAAG C AC AT AAGGAAT T TAAG AAAAAT GAAG AAG AG 

SEQ ID NO. 6802 

STRAIN 090 

GCTAAAGAGAGGGTAGATGTTCTTGCCT 

AT AAAC AG GG AC T T T T T GAT AC AC GAG AG C AAG C G AAAC GT GGTGT TAT G 
G C AG G AAT G GT GAT T AAC GT TAT CAAT G GAG AAC G T T AT GAT AAAC C AG G 
TGAAAAGGT TGCAGACG AT AC TGAATT AAAACT AAAAGGT G AAAAAC T AA 
AAT AT G T T AGT AG AG G T G GAT T G AAAT T AGAAAAAGC T T T AC AAG T T T T T 
GAAATTTCAGTTGCAGATAAGCTAACTATAGATATTGGCGCCTCTACGGG 
TGGTTTTACTGATGTTATGCTACAATCAGGAGCGCGTTTAGTTTACGCAG 
TAG AT GT AGG AAC AAAT C AAT T AGT T T G GAAG T T ACGT C AG GAT CAT CGT 
GTTCGTTCTATGGAACAATATAATTTTAGGTATGCCCAAAAAGAAGATTT 
C AAGG AGGG AC T G C C T G AAT T T G C AT CG AT AG AT G T C T CAT T T AT C T C T C 
TTAATTTGATTTTACCAGCTCTAAAAGAAATTTTAGTGGATGGTGGACAA 
GT AGT GG CAT T AAT T AAAC C AC AAT T T GAAG C AGGT CGT GAG C AAAT T G G 
T AAAAAT GGT AT T GT C AAAG AC AAG T T G GT T CAT G AAAAGG T T T T G AC AA 
C AGT G AC CAAT T T C AC G AAAG AT TAT G G AT AT ACG G T T AAAC AT CT T GAT 
TTTTCGCCCATTCAAGGTGGACATGGAAATATTGAGTTTTTAATGCATTT 
GCAAAAGTGTCAAGATCCACAAAATCTTGTGCTTGACCAAATACAAGATG
TT ATAGAAAAAGC ACATAAGGAATTT AAGAAAAAT GAAGAAGAG 

SEQ ID NO. 6803 

STRAIN A909 

G C T AAAG AGAG GGT AG AT GT T C T T G C CT A 

TAAACAGGGACTTT TTGATACACGAGAGCAAGCGAAACGTGGT GT TAT GG 
CAGGAATGGTGATTAACGTTATCAATGGAGAACGTTATGATAAACCAGGT 
GAAAAGGT T GC AGAC GAT ACT G AAT T AAAACT AAAAGGT G AAAAAC T AAA 
AT AT GT TAG TAG AG G T GG AT T G AAAT T AG AAAAAG C T T T AC AAGT T T T T G 
AAAT T TC AGT T G C AG AT AAG CT AACT AT AGAT AT T G G C GCC T CT ACGGG T 
GGT T T TACT GAT GT TAT G C T AC AAT C AGG AG C G C GT T T AGT T T ACG C AGT 
AGATGTAGGAACAAATCAATTAGTTTGGAAGTTACGTCAGGATCATCGTG 
T T C GT T CT AT G G AAC AAT AT AAT T T T AG GT AT GCC C AAAAAG AAG AT T T C 
AAGGAGGGACTGCCTGAATTTGCATCGATAGATGTCTCATTTATCTCTCT 
T AAT T T GAT T T T AC C AGCT CT AAAAG AAATT T T AGT G GAT G GT GG AC AAG 
T AGT GGCATT AATT AAAC CACAAT TTGAAGCAGGTCGT GAGC AAATT GGT 
AAAAAT G GT AT T GT C AAAGAC AAGT T GGT T CAT G AAAAG GT T T T GAC AAC 
AGT G AC C AAT TT C AC GAAAGAT TAT GG AT AT ACGGT T AAAC AT C T T GAT T 
T T T CG C C CAT T C AAGGT GGAC AT GGAAAT AT T GAG T T T T T AAT G CAT T T G 
C AAAAGT GT C AAG AT C C AC AAAAT CTTGTGCTT GAC C AAAT AC AAGAT GT 
TATAGAAAAAGCACATAAGGAATTTAAGAAAAATGAAGAAGAG 

SEQ ID NO. 6804 

STRAIN H36B 

G C T AAAG AG AG G GT AG AT GT T C T T G C CT AT AAAC AGG 
GAC T T T T T GAT AC AC G AG AGC AAG C GAAAC GT GG T GT TAT G G C AG G AAT G 
GTGATTAACGT TAT C AAT GGAGAACGTTATGAT AAAC CAGGTGAAAAGGT 
T G C AG AC GAT AC T G AAT T AAAAC T AAAAGGT GAAAAACT AAAAT AT G T T A 
GT AG AGGT G GAT T G AAAT TAG AAAAAG C T T T AC AAGT T T T T G AAAT T T C A 
GTTGCAGATAAGCTAACTATAGATATTGGCGCCTCTACGGGTGGTTTTAC 
TGATGTTATGCTACAATCAGGAGCGCGTTTAGTTTACGCAGTAGATGTAG 
G AAC AAAT C AAT T AGT T T GGAAGT T AC G T C AGG AT CAT CGTGTTCGTTCT 
AT GG AAC AAT AT AAT T T T AG GT AT GCC C AAAAAG AAG AT T T C AAGGAGG G 
ACTGCCTGAATTTGCATCGATAGATGTCTCATTTATCTCTCTTAATTTGA 
T T T T AC C AG C T C T AAAAG AAAT T T T AGT G GAT GGT GGAC AAGT AGT GG C A 
TTAATT AAAC CACAAT T TGAAGC AGGT CGT GAG CAAAT T GGT AAAAAT GG 
TAT T GT C AAAG AC AAGT T GGT T CAT GAAAAGGT T T T GAC AAC AGT GAC C A 
AT T T C AC GAAAGAT TAT G GAT AT AC GGT T AAAC AT C T T GAT TTTTCGCCC 
AT T C AAG GT G GAC AT GGAAAT AT T G AGT T T T T AAT G C AT T T G C AAAAGT G 
TCAAGATCCACAAAATCTTGTGCTTGACCAAATACAAGATGTTATAGAAA
AAGCACATAAGGAATTTAAGAAAAATGAAGAAGAG 

SEQ ID NO. 6805 

STRAIN 18RS21 

GCTAAAGAGAGGGTAGATGTTCTTGCCTA 

T AAAC AG G GAC T T T T T GAT AC AC GAG AG C AAG C G AAAC GT GGT GT TAT G G 
C AGG AAT GGT G AT T AAC GT T AT C AAT GG AG AAC G T T AT GAT AAAC C AG GT 
G AAAAG G T T G C AG AC GAT ACT G AAT T AAAAC T AAAAGGT G AAAAAC T AAA 
AT AT GT T AGT AG AG GT G GAT T G AAAT T AG AAAAAGCT T T AC AAGT T T T T G 
AAAT T T C AGT T G C AG AT AAG C T AAC TAT AG AT AT TGGCGCCT C T AC GG G T 
GGTTTTACTGATGTTATGCTACAATCAGGAGCGCGTTTAGTTTACGCAGT 
AGAT GT AGG AAC AAAT C AAT T AGT T T G G AAGT TAG G T C AG GAT CAT C GT G 
TTCGTTCTATGGAAC AAT AT AATTTT AGGT ATGCCC AAAAAG AAG ATT TC 
AAGGAGGGACTGCCTGAATTTGCATCGATAGATGTCTCATTTATCTCTCT 
TAAT TTG AT TTTACC AGCT CT AAAAG AAATTTT AGT GGATGGT GGAC AAG 
TAGTGGCATTAATTAAACCACAATTTGAAGCAGGTCGTGAGCAAATTGGT 
AAAAATGGT ATT GT CAAAGAC AAGTTGGT T C ATGAAAAGGT T T T GAC AAC 
AG T GAC C AAT T T C AC GAAAGAT TAT GG AT AT ACG GT T AAAC AT C T T G AT T 
TTTCGCCCATTCAAGGT GGAC AT GGAAAT AT TGAGTTTTT AAT GCAT TTG 
C AAAAGT GT C AAG AT C C AC AAAAT CT T GT G C T T GAC CAAAT AC AAG AT GT 
TATAGAAAAAGCACATAAGGAATTTAAGAAAAATGAAGAAGAG 

SEQ ID NO. 6806 
STRAIN M732

GCTAAAGAGAGGGTAGATGTTCTTGCCTA 

TAAACAGGGACTTTTTGATACACGAGAGCAAGCGAAACGTGGTGTTATGG 
CAGGACTGGTGATTAACGTTATCAATGGAGAACGTTATGATAAACCAGGC 
GAAAAGGTTGCAGACGATACTGAATTAAAACTAAAAGGTGAAAAACTAAA 
ATATGTTAGTAGAGGTGGATTGAAATTAGAAAAAGCTTTACAAGTTTTTG 
AAATTTCAGTTGCAGATAAGCTAACTATAGATATTGGCGCCTCTACGGGT 
GGTTTTACTGATGTTATGCTACAATCAGGAGCGCGTTTAGTTTACGCAGT 
AGAT GT AGG AAC AAAT C AAT T AGT T T GG AAG T T AC GT C AG GAT CAT C GT G 
TTCGTTCTATGGAACAATATAATTTTAGGTATGCCCAAAAAGAAGATTTC 
AAGGAGGGACTGCCTGAATTTGCATCGATAGATGTCTCATTTATCTCTCT 
TAATTTGATTTTACCAGCTCTAAAAGAAATTTTAGTGGATGGTGGACAAG 
TAGTGGCATTAATTAAACCACAATTTGAAGCAGGTCGTGAGCAAATTGGT 
AAAAAT GG TAT T GT C AAAGAC AAGT T GGT T CAT G AAAAGG T T T T G AC AAC 
AGT G AC C AAT T T C ACGAAAG AT TAT G GAT AT AC GGT T AAAC AT C T T GAT T 
TTTCGCCCGTTCAAGGTGGACATGGAAATATTGAGTTTTTAATGCATTTG 
CAAAAGT GT CAAGAT CCACAAAAT CTTGT GCT TGACCAAAT ACAAGATGT 
TATAGAAAAAGCACATAAGGAATTTAAGAAAAATGAAGAAGAG 

SEQ ID NO. 6807 

STRAIN COH1 

GCTAAAGAGAGGGTAGATGTTCTTGCCT 

AT AAAC AGG G AC T T T T T GAT AC AC GAG AG C AAG C G AAAC GT G GT G T TAT G 
G C AGG AC T G GT GAT T AAC GT TAT C AAT GG AG AAC G T T AT GAT AAAC C AG G 
C G AAAAGGT T G C AGAC GAT AC T GAAT T AAAAC T AAAAG GT G AAAAAC T AA 
AATATGTTAGTAGAGGTGGATTGAAATTAGAAAAAGCTTTACAAGTTTTT 
GAAAT T T C AGT T G C AG AT AAG C T AACT AT AGAT AT TGGCGCCTC T AC GG G 
TGGTTTTACTGATGTTATGCTACAATCAGGAGCGCGTTTAGTTTACGCAG 
TAGATGTAGGAACAAATCAATTAGTTTGGAAGTTACGTCAGGATCATCGT 
G T T CGT T C T AT G G AAC AAT AT AAT T T T AGGT AT G C C C AAAAAG AAG AT T T 
C AAG GAG G G AC T GC C T GAAT T T G C AT C GAT AG AT GT C T CAT T T AT C T C T C 
TTAATTTGATTTTACCAGCTCTAAAAGAAATTTTAGTGGATGGTGGACAA 
GTAGTGGCATTAATTAAACCACAATTTGAAGCAGGTCGTGAGCAAATTGG 
TAAAAATGGTATTGTCAAAGACAAGTTGGTTCATGAAAAGGTTTTGACAA 
C AGT G AC C AAT T T C AC GAAAG AT T AT G G AT AT ACGGT T AAAC AT C T T GAT 
T T TT CGCCCGTTCAAGGTGGACATGG AAAT ATT G AGT TTTTAATGC ATT T 
G CAAAAGT GT CAAGAT C C AC AAAAT CTTGTGCTT GAC C AAAT AC AAG AT G 
T TAT AG AAAAAG C AC AT AAGG AAT T T AAGAAAAAT G AAG AAG AG 

SEQ ID NO. 6808 

STRAIN M781 

GCTAAAGAGAGGGTAGATGTTCTTGCCT 

AT AAAC AGG GAC T T T T T GAT AC ACG AG AG C AAG C G AAAC G T G GT GT TAT G 
G C AGG ACT G GT GAT T AACG T T AT C AAT G G AGAAC G T T AT GAT AAAC C AG G 
CGAAAAGGTTGC AG ACG AT ACT G AATTAAAACT AAAAGGT GAAAAACTAA 
AAT AT GT T AGT AG AGGT GG AT T GAAAT T AG AAAAAGC T T T AC AAGT T T T T 
GAAATTTCAGTTGCAGATAAGCTAACTATAGATATTGGCGCCTCTACGGG 
TGGTTTTACTGATGTTATGCTACAATCAGGAGCGCGTTTAGTTTACGCAG 
TAG AT GT AGG AAC AAAT C AAT T AGT T T G G AAG T T ACGT C AGG AT CAT CGT 
GTTCGTTCTATGGAACAATATAATTTTAGGTATGCCCAAAAAGAAGATTT 
CAAGGAGGGACTGCCTGAATTTGCATCGATAGATGTCTCATTTATCTCTC 
TTAATTTGATTTTACCAGCTCTAAAAGAAATTTTAGTGGATGGTGGACAA 
GT AGT G G CAT T AAT T AAAC C AC AAT T T G AAGC AGG T CGT GAG C AAAT T G G 
TAAAAATGGTATTGTCAAAGACAAGTTGGTTCATGAAAAGGTTTTGACAA 
CAGTGACCAATTTCACGAAAGATTATGGATATACGGTTAAACATCTTGAT 
TTTTCGCCCGTTCAAGGTGGACATGGAAATATTGAGTTTTTAATGCATTT 
GCAAAAGT GT CAAGAT CCACAAAAT CT TGT GCT TGACCAAAT AC AAGATG 
T T AT AGAAAAAG C AC AT AAGGAAT T T AAGAAAAAT G AAG AAG AG 

SEQ ID NO. 6809 

STRAIN CJB110 

GCTAAAGAGAGGGTAGATGTTCTTGCCTA 

T AAAC AG G G AC T T T T T GAT AC AC GAG AG C AAG C GAAACG T G GT GT T AT G G 
CAGGAATGGTGATTAACGTTATCAATGGAGAACGTTATGATAAACCAGGT 
GAAAAGGTTGCAGACGATACTGAATTAAAACTAAAAGGTGAAAAACTAAA
AT AT GT T AGT AGAG GT G GAT T G AAAT T AG AAAAAG C T T T AC AAGT T T T T G 
AAATTTCAGTTGCAGATAAGCTAACTATAGATATTGGCGCCTCTACGGGT 
GGTTTTACTGATGTTATGCTACAATCAGGAGCGCGTTTAGTTTACGCAGT 
AGAT GT AG G AAC AAAT C AAT T AGT T T G G AAG T T AC GT C AG GAT CAT C GT G 
T T C GT T C T AT GG AAC AAT AT AAT T T T AG GT AT GC C C AAAAAG AAG AT T T C 
AAGGAGGGACTGCCTGAATTTGCATCGATAGATGTCTCATTTATCTCTCT 
T AAT T T GAT T T T AC C AG C T CT AAAAGAAAT T T T AG T G G AT GG T G G AC AAG 
T AGT GG C AT T AAT T AAAC C AC AAT T T GAAG C AGGT CG T G AGC AAAT T G G T 
AAAAAT G GT AT T GT C AAAGAC AAGT T GG T T CAT GAAAAG GT T T T G AC AAC 
AGT G AC C AAT T T C AC G AAAG AT T AT GG AT AT ACG GT T AAAC AT C T T GAT T 
TTTCGCCCATTCAAGGTGGACATGGAAATATTGAGTTTTTAATGCATTTG 
C AAAAGT GT C AAG AT C C AC AAAAT CTTGTGCTT G AC C AAAT AC AAG AT G T 
TATAGAAAAAGCACATAAGGAATTTAAGAAAAATGAAGAAGAG 

SEQ ID NO. 6810 

STRAIN 1169NT 

GCTAAAGAGAGGGTAGATGTTCTTGCCTA 

TAAACAGGGACTTTTTGATACACGAGAGCAAGCGAAACGTGGTGTTATGG 
CAGGACTGGTGATTAACGTTATCAATGGAGAACGTTATGATAAACCAGGC 
GAAAAG GT T G C AG AC GAT AC T G AAT T AAAAC T AAAAGGT G AAAAAC T AAA 
AT AT GT T AGT AG AGGT GG AT T GAAAT TAG AAAAAG C T T T AC AAGT T T T T G 
AAAT T T C AGT T G C AG AT AAG C T AAC TAT AG AT AT TGGCGCCT CT AC G GGT 
GGTTTTACTGATGTTATGCTACAATCAGGAGCGCGTTTAGTTTACGCAGT 
AG AT GT AGG AAC AAAT C AAT T AGT T T G G AAGT T AC G T C AG GAT CAT C GT G 
T T C GT T C TAT GG AAC AAT AT AAT T T T AGGT AT GC C C AAAAAG AAG AT T T C 
AAGGAGGGACTGCCTGAATTTGCATCGATAGATGTCTCATTTATCTCTCT 
TAATTTGATTTTGCCAGCTCTAAAAGAAATTTTAGTGGATGGTGGACAAG 
TAGTGGCATTAATTAAACCACAATTTGAAGCAGGTCGTGAGCAAATTGGT 
AAAAAT G GT AT T GT C AAAG AC AAGT T GGT T CAT G AAAAGGT T T T G AC AAC 
AG T GAC C AAT T T C ACG AAAG AT TAT G GAT AT AC G GT T AAAC AT CT T GAT T 
TTTCGCCCATTCAAGGTGGACATGGAAATATTGAGTTTTTAATGCATTTG 
C AAAAGT GT C AAG AT C C AC AAAAT CTTGTGCTT GAC C AAAT AC AAG AT GT 
TATAGAAAAAGCACATAAGGAATTTAAGAAAAATGAAGAAGAG 

SEQ ID NO. 6811 

STRAIN JM9130013 
GCTAAAGAGAGGGTAGATGTTCTTGCCTA 

T AAAC AGG GAC TT T T T GAT AC ACG AG AG C AAG C G AAAC G T GG T GT T AT GG 
C AG GAAT GGT GAT T AAC GT TAT C AAT GG AG AAC G T T AT GAT AAAC C AGGT 
G AAAAGGT T G C AG ACGAT ACT G AAT T AAAAC T AAAAG G T G AAAAAC T AAA 
AT AT GT T AGT AG AG GT GG AT T GAAAT TAG AAAAAG C T T T AC AAG T T T T T G 
AAAT T T C AGT T G C AG AT AAGC T AACT AT AG AT AT T GG C G C CT C T AC GGG T 
GGTTTTACTGATGTTATGCTACAATCAGGAGCGCGTTTAGTTTACGCAGT 
AGAT GT AGGAACAAAT CAATTAGTT T GGAAGTT ACGT C AGGAT CAT CGT G 
TTCGTTCTATGGAACAATATAATTTTAGGTATGCCCAAAAAGAAGATTTC 
AAGGAGGGACTGCCTGAATTTGCATCGATAGATGTCTCATTTATCTCTCT 
T AAT T T GAT T T T AC C AG CT C T AAAAG AAAT T T T AGT GG AT G GT G GAC AAG 
TAGTGGCATTAATTAAACCACAATTTGAAGCAGGTCGTGAGCAAATTGGT 
AAAAATGGT AT TGT C AAAG AC AAGTTGGTTC AT G AAAAGGT TTT GAC AAC 
AG T GAC C AAT T T C AC G AAAG AT TAT G GAT AT AC GGT T AAAC AT C T T GAT T 
TTTCGCCCATTCAAGGTGGACATGGAAATATTGAGTTTTTAATGCATTTG 
C AAAAGT GT C AAG AT C C AC AAAAT CTTGTGCTT G AC C AAAT AC AAG AT GT 
TATAGAAAAAGCACATAAGGAATTTAAGAAAAATGAAGAAGAG 

SEQ ID NO. 6812 
STRAIN 2603 frame: 1

MAKERVDVLAYKQGLFDTREQAKRGVMAGMVINVINGERYDKPGEKVADDTELKLKGEKLK
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPIQGGHGNIEFL
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE

SEQ ID NO. 6813 
STRAIN 090 frame: 1

AKERVDVLAYKQGLFDTREQAKRGVMAGMVINVINGERYDKPGEKVADDTELKLKGEKLK 
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK 
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVXTTVTNFTKDYGYTVKHLDFSPIQGGHGNIEFL 
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE 

SEQ ID NO. 6814 

STRAIN A909 frame: 1 

AKERVDVLAYKQGLFDTREQAKRGVMAGMVINVINGERYDKPGEKVADDTELKLKGEKLK 
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK 
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPIQGGHGNIEFL
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE

SEQ ID NO. 6815 

STRAIN 18RS21 frame: 1 

AKERVDVLAYKQGLFDTREQAKRGVMAGMVINVINGERYDKPGEKVADDTELKLKGEKLK 
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK 
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPIQGGHGNIEFL
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE

SEQ ID NO. 6816 

STRAIN M732 frame: 1 

AKERVDVLAYKQGLFDTREQAKRGVMAGLVINVINGERYDKPGEKVADDTELKLKGEKLK 
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPVQGGHGNIEFL
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE

SEQ ID NO. 6817 

STRAIN COH1 frame: 1 

AKERVDVLAYKQGLFDTREQAKRGVMAGLVINVINGERYDKPGEKVADDTELKLKGEKLK 
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPVQGGHGNIEFL
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE

SEQ ID NO. 6818 

STRAIN M781 frame: 1 

AKERVDVLAYKQGLFDTREQAKRGVMAGLVINVINGERYDKPGEKVADDTELKLKGEKLK 
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPVQGGHGNIEFL
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE

SEQ ID NO. 6819 

STRAIN CJB110 frame: 1 

AKERVDVLAYKQGLFDTREQAKRGVMAGMVINVINGERYDKPGEKVADDTELKLKGEKLK 
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPIQGGHGNIEFL
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE

SEQ ID NO. 6820 

STRAIN 1169NT frame: 1 

AKERVDVLAYKQGLFDTREQAKRGVMAGLVINVINGERYDKPGEKVADDTELKLKGEKLK 
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPIQGGHGNIEFL
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE

SEQ ID NO. 6821 
STRAIN JM9130013 frame: 1

AKERVDVLAYKQGLFDTREQAKRGVMAGMVINVINGERYDKPGEKVADDTELKLKGEKLK 
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPIQGGHGNIEFL
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE

SEQ ID NO. 6822 

STRAIN H36B frame: 1

AKERVDVLAYKQGLFDTREQAKRGVMAGMVINVINGERYDKPGEKVADDTELKLKGEKLK
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPIQGGHGNIEFL
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE

SEQ ID NO. 6901 
STRAIN 2603 

ATGAATAAAAAGGTACTATTGACATCGACAATGGCAGCTTCGCTATTATCAGTCGCAAGT 
GTTCAAGCACAAGAAACAGATACGACGTGGACAGCACGTACTGTTTCAGAGGTAAAGGCT 
GAT T T G GT AAAGC AAG AC AAT AAAT CAT CAT AT ACT GT G AAAT AT GGT G AT AC ACT AAG C 
GT T AT T T C AGAAG C AAT GT C AAT T GAT AT GAAT GT CT T AGC AAAAAT AAAT AAC AT TG C A 
GATATCAATCTTATTTATCCTGAGACAACACTGACAGTAACTTACGATCAGAAGAGTCAT 
AC T G C C ACT T C AAT GAAAAT AG AAAC AC C AG C AAC AAAT GCTGCTGGT C AAAC AAC AG C T 

ACTGTGGATTTGAAAACCAATCAAGTTTCTGTTGCAGACCAAAAAGTTTCTCTCAATACA 
AT T T C G G AAG G T AT G AC AC C AG AAG C AG C AAC AAC GAT TGTTTCGC C AAT G AAG AC AT AT 
TCTTCTGCGC C AG C T T T G AAAT C AAAAG AAG T AT T AG C AC AAG AGC AAG C T GT T AGT C AA 
G C AG C AGC T AAT G AAC AG GT AT C AC C AGC T C C T G T G AAGT C GAT T AC T T C AG AAGT T C C A 
G C AG C T AAAG AGG AAGT T AAAC C AACT CAG AC GT C AG T C AGT C AG T C AAC AAC AG TAT C A 
CCAGCTTCTGTTGCCGCTGAAACACCAGCTCCAGTAGCTAAAGTAGCACCGGTAAGAACT 
GTAGCAGCCCCTAGAGTGGCAAGTGTTAAAGTAGTCACTCCTAAAGTAGAAACTGGTGCA 
TCACCAGAGCATGTATCAGCTCCAGCAGTTCCTGTGACTACGACTTCACCAGCTACAGAC 
AGT AAGT T AC AAG C GAC T G AAG T T AAGAG CGT T C C G GT AG C AC AAAAAG C T C C AAC AG C A 
AC AC C G G T AG C AC AAC C AG CT T CAAC AACAAAT GC AGT AG CT G C AC AT C C T GAAAAT G C A 

GGGCTCCAACCTCATGTTGCAGCTTATAAAGAAAAAGTAGCGTCAACTTATGGAGTTAAT 
GAAT T CAG T AC AT AC C G T G C GG G AG AT C C AG GT G AT CAT G G T AAAG G T T TAG C AGT T GAC 

TTTATTGTAGGTACTAATCAAGCACTTGGTAATAAAGTTGCACAGTACTCTACACAAAAT 
AT GG CAG C AAAT AAC AT TT C AT AT GT TAT CT GG CAAC AAAAGT T T T AC T C AAAT AC AAAC 

AGTATTTATGGACCTGCTAATACTTGGAATGCAATGCCAGATCGTGGTGGCGTTACTGCC 
AAC C AC TAT GAC C AC GT T C AC G T AT CAT T T AAC AAAT AAT AT AAAAAAGG AAGC T AT T T G 
GCTTCTTTTTTATATGCCTTGAATAGACTTTCAAGGTTCTTATATAATTTTTATTA 

SEQ ID NO. 6902 

STRAIN 090

T GAG AC AAC AC T GAC AGT AAC T T AC GAT C AG AAGAGT CAT AC T G C C AC TT 
C AAT GAAAAT AG AAAC AC CAG CAAC AAAT GCTGCTGGT C AAAC AC CAG C T 
ACTGTGGATTTGAAAACCAATCAAGTTTCTGTTGCAGACCAAAAAGTTTC 
TCTCAATACAATTTCGGAAGGTATGACACCAGAAGCAGCAACAACGATTG

TTTCGCCAATGAAGACATATTCTTCTGCGCCAGCTTTGAAATCAAAAGAA 
GTATTAGCACAAGAGCAAGCTGTTAGTCAAGCAGCAGCTAATGAACAGGT 
AT CAAC AG C T C C T GT G AAG T C GAT T AC T T CAG AAG T T CC AG CAG C T AAAG 
AG G AAG T T AAAC CAAC T C AG ACGT CAG T C AGT CAG T CAAC AAC AG TAT C A 

CCAGCTTCTGTTGCCGCTGAAACACCAGCTCCAGTAGCTAAAGTAGCACC 
G G T AAG AAC T GT AG CAG C C C C TAG AGT GG C AAG T GT T AAAG TAG T C AC T C 
C T AAAG TAG AAAC T G G T G CAT C AC CAG AG CAT G T AT CAG C T C CAG CAG T T 
C C T G T G AC T AC GAC T T CAAC AG C T AC AG AC AG T AAG T T AC AAG C G AC T G A 

AGTTAAGAGCGTTCCGGTAGCACAAAAAGCTCCAACAGCAACACCGGTAG 
C AC AAC CAG CT T CAAC AAC AAAT G CAG TAG C T G C AC AT C C T GAAAAT G C A 

GGGCTCCAACCTCATGTTGCAGCTTATAAAGAAAAAGTAGCGTCAACTTA 
T G GAG T T AAT GAAT T C AGT AC AT AC C GT G C AG GT GAT C CAG GT G AT CAT G 

GTAAAGGTTTAGCAGTCGACTTTATTGTAGGTAAAAACCAAGCACTTGGT 
AAT G AAGT T G C AC AG TACT C T AC AC AAAAT AT G G C AG C AAAT AAC AT T T C 
AT AT GT TAT c T G G CAAC AAAAGT T T T ACT C AAAT AC AAAT AG TAT T TAT G 
GACCTGCTAATACTTGGAATGCAATGCCAGATCGTGGTGGCGTTACTGCC 
AACCAT TATGAC CATGT T C ACGT AT CATTT AACAAAT AAT AT AAAAAAGG 
AAGCTATTTGGCTTCTTTTTTATATGCCTTGAATAGACTTTCAAGGTTCT
TATATAAT T TTTAT T A 

SEQ ID NO. 6903 

STRAIN A909 

CTGATTTGGTAAAGCAAGACAATAAATCATCATATACTGTGAA 
AT AT GGT GAT AC ACT AAG CGT T AT T T C AG AAGC AAT GT C AAT T GAT AT G A 
AT GT C T T AG C AAAAAT T AAT AAC AT T G C AG AT AT C AAT C T TAT T T AT C c T 
G AG AC AAC ACT G a C AGT AACT T AC GAT C AG AAG AGT C AT ACT G CT AC T T C 
AAT G AAAAT AG AAAC AC C AG C AAC AAAT GCTGCTGGT C AAAC Aa C AG c T A 
CTGTCGATTTGAAAACCAATCAAGTTTCTGTTGCAGACCAAAAAGTTTCT 
CTC AATAC AAT T T CGGAAGGTAT GACACCAGAAGCAGCAAC AACGAT TGT 
T T CGCCAAT GAAGACATATT CTT cT GCGC CAGCTT T GAAAT CAAAAGAAG 
T AT T Ag C AC AAG G G C a AG C T GT TAG T C AAGC AG C AG C T AAT G AAC AG GT A 
T C Ac C AG C T c C T GT G AAGT C GAT TACT T C AG AAGT T C C Ag C AG C T AAAG A 
G G AAG T T AAAC C Aa C T C Ag AC GT C Ag T C AG T C AGT C AAC AAC AGT AT C AC 
CAgCTTCTGTTGCCGCTGAAACACCAGCTCCAgTAGCTAAaGTAGCACCG 
G T AAG AAC T GT AG C AGC C C C TAG AGT G G C AAG T GT T AAAGT AGT C ACT C C 
TAAAGTAGAAACTGGTGCATCACCAGAGCATGTATCAGCTCCAGCAGTTC 
CT GT GAC T AC G ACT T C AAC AGCT AC AG AC AGT AAGT T AC AAG C GACT GAA 
GTTAAGAGCGTTCCGGTAGCACAAAAAGCTCCAACAGCAACACCGGTAGC 
AC AAC C AGC T T C AAC AAC AAAT G C AG TAG C T G C AC AT C CT GAAAAT G C AA 
GGCTCCAACCTCATGTTGCAGCTTATAAAGAAAAAGTAGCGTCAACTTAT 
GG AGT T AAT G AAT T c AGT AC AT AC C GT G C GGGAG AT C C AG GT GAT C AT GG 
TAAAGGTTTAGCAGTTGACTTTATTGTAgGTAAAAACCAAGCACTTGGTA 
AT G AAGT T G C AC AGT AC T C T AC AC AAAAT AT G GC AGC a AAT AAC AT T T C A 
TATGTTATCTGGCAACAAAAGTTTTACTCAAATaCAAATAGTATTTATGG 
ACcTGCTAATACTTGGAATGCAATGCCAGATCGTGGTGGCGTTAcTGCCA 
AC C a C TAT G AC C AC GT T C AC G T AT CAT T T AAC AAAT a AT AT AAAAAAGGA 
AGCTaTTTGGCTTCTTTTTTATATGCCTTGCATAGACtTTCAAGGTTCTT 
ATATAATTTTTATTA 

SEQ ID NO. 6904 

STRAIN H36B

CT GAT T T GGT AAAGCAAGACAAT AAAT CAT C AT AT AcT GT GAAAT A 
T GG T GAT AC Ac T AAGCGT T AT T T C AG AAG C AAT G T C a AT T GAT AT G AAT G 
T CT T AG C AAAAAT T AAT AAC AT T G C AG AT AT C AAT C T TAT T T AT C c T GAG 
AC AAC a C T G a C AG T Aa C T T ACG AT C AG AAGAG T CAT AC T G CT ACT T C AAT 
GAAAAT AG AAAC AC C AGC AAC AAAT GCTGCTGGT C AAAC AAC AG C TACT G 
TCGATTTGAAAACCAATCAAGTTTCTGTTGCAGACCAAAAAGTTTCTCTC 
AAT AC AAT T T C G G AAG G TAT GAC AC C AG AAG C AG C AAC AACGAT T GT T T C 
G C C AAT GAAG AC AT AT TCTTCTGCGC C AG CT T T GAAAT C AAAAG AAGT AT 
TAG C AC AAGG G C AAG C T GT T AGT C AAG C AG C AG C T AAT G AAC AG GT AT C A 
C C AG CT C CT GT G AAGT C GAT T AC T T C AG AAG T T C C AG C AG C T AAAG AG G A 
AGT T AAAC C AAC T C AG AC GT C AG T C AGT C AG T C AAC AAC AGT AT C AC C AG 
CTTcTGTTGCCGCT G AAAC AC C AG CTC C AGT AG c T AAAGT AG C AC C GG T A 
AG AAC T GT AG C AG C C C c TAG AGT G G C AAGT G T T AAAGT AGT C AC T C c T AA 
AGT AGAAAC T G GT G CAT C AC C AG AG CAT GT AT C AG C T CC AG C AGT T C C T G 
T GAC T AC GAC T T C AAC AG CT AC AG AC AGT AAG T T AC AAG C GAC T G AAGT T 
AAGAG C G T T C C G GT AG C AC AAAAAG CT C C AAC AG C AAC AC C G GT AG C AC A 
AC C AG CTT C AAC AAC AAAT G C AG T AG C T G C AC AT C CT GAAAAT GC AAG G C 
T C C AAC C T CAT GT T G C AG CT T AT AAAGAAAAAG T AG C GT C AACT TAT G G A 
G T T AAT G AAT T C AGT AC AT AC C G T G C G G GAG AT C C AG GT G AT CAT G GT AA 
AGGT T TAG C AG T T G AC T T T AT T G T AG G T AAAAAC C AAG C AC T T GGT AAT G 
AAGT T G C AC AG TACT C T AC AC AAAA t a T G G C AG C AAAT AAC AT T T CAT AT 
GTTATCTGGCaACAAAAGTTTTACTCAAATACAAATAGTATTTATGGACC 
TGCTAATACTTGGAATGCAATGCCAgATCGTGGTGGCGTTACTGCCAACC 
ACT AT GAC C AC G T T C AC GT AT CAT T T AAC AAAT AAT AT AAAAAAG GAAG C 
TATTTGGCTTCTTTTTTATATGCCTTGCATAGACtTTCAAGGTTCTTATA 
T AAT TTTT ATT A 

SEQ ID NO. 6905 

STRAIN 18RS21 

CT GAT T T G G T AAAG C AAG AC AAT 
AAATCATCATATACTGTGAAATATGGTGATACAcTAAGcGTTATTTCAGA
AG C AAT GT C AAT T GAT AT GAAT GT C T TAG C AAAAa T AAAT AAC AT T GC AG 
AT AT C AAT CT TAT T TAT C c T G AGAC AAC a C T Ga C AGT AACT T AC G AT C AG 
AAGAGT CAT AC TGCCaCTT C AAT GAAAAT AG AAAC AC C AGC Aa C AAAT G C 
T G CT GGT C Aa AC Aa C AG CT AC T GT GG AT T T G AAAAC C AAT C Aa GT T T CT G 
T T G C AGAC C AAAAAGT T T C T C T C AAT AC AAT T T C GG AAG GT AT GAC AC C A 
GAAGCAGCAACAACGATTGTTTCGCCAATGAAGACaTATTCTTcTGCGCC 
AG CT T T G AAa T C AAAAG AAGT AT TAG C ACAAG AG C AAGC T GT T AGT C AAG 
CAGCAGCTAATGAACAGGTATCACCAGCTCCTGTGAAGTCGATTACTTCA 
G AAGT T C C AG C AG C T AAAG AG G AAGT T AAAC C AACT C AG ACGT C AGT C AG 
T C AGT C AAC AAC AGT AT C AC C AG CT T CTGTTGCCGCT G AAAC AC C AG CT C 
CAGTAGCTAAAGTAGCACCGGTAAGAACTGTAGCAGCCCCTAGAGTGGCA 
AGT GT T AAAG T AGT C ACT C C T AAAGT AG AAACT G G T G CAT C AC C AGAGC A 
T GT AT C AG C T C C AG C AGT T C C T GT GAC T ACGAC T T C AC C AG CT AC AG AC A 
GTAAGTTACAAGCGACTGAAGTTAAGAGCGTTCCGGTAGCACAAAAAGCT 
C C AAC AG C AAC AC CG GT AG C AC AAC C AG C T T C AAC AAC AAAT G C AG TAG C 
T GC ACAT CCT GAAAAT GCAGGGCTCC AAC CT CAT GTTGCAGCT TAT AAAG 
AAAAAGT AG C G T C AAC T T AT GGAGT T AAT GAAT T C AG T AC AT AC CGT GC G 
GG AGAT C C AGGT GAT CAT GGT AAAG GT T TAG C AG T T GAC T T TAT T GT AGG 
TACTAATCAAGCACTTGGTAATAAAGTTGCACAGTACTcTACACAAAATA 
TGGCAGCAAATAACATTTCATATGTTATCTGGCAACAAAAGTTTTACTCA 
AAT AC AAAC AGT AT T TAT G GAC C T G CT AAT ACT T GG AAT G C AAT G CC AGA 
TCGTGGTGGCGTTACTGCCAACCACTATGACCACGTTCACGTATCATTTA 
ACAAATAATATAAAAAAGGAAGCTATTTGGCTTCTTTTTTATATGCCTTG 
AATAGACTTTCAAGGTTCTTATATAATTTTTATTA 

SEQ ID NO. 6906 

STRAIN COH1 
CTGATTT 

GGT AAAG C AAG AC AAT AAAT CAT CAT AT ACT GT GAAAT AT GGT GAT AC AC 
T AAGC GT TAT T T C AG AAGC AAT GT C AAT T GAT AT GAAT GT C T T AG C AAAA 
AT T AAT AAC AT T G C AG AT AT C AAT C T TAT T T AT CCT GAG AC AAC ACT GAC 
AGT AACT TACGAT C AGAAG AGT C AT ACTGCC ACT T C AAT GAAAAT AG AAA 
C AC C AG C AAC AAAT GCTGCTGGT C AAAC AAC AG c T AC T GT C GAT T T G AAA 
AC C AAT C AAGT TTTTGTTG C AG AC C AAAAAGT T T cT CT C AAT AC AAT T T C 
G G AAG GT AT GAC AC C AG a a G C AG C AAC AAC GAT TGTTTCGC C AAT G AAG A 
C a TAT TCTTCTGCGC C AG C T T T GAAAT C AAAAG AAG TAT TAG C AC AAGAG 
CAAGCTGTTAGTCAAGTAGCAGCTAATGAACAGGTATCACCAGCTCCTGT 
GAAGTCGATTACTTCAGAAGTTCCAGCAGCTAAAGAGGAAGTTAAACCAA 
CTCAGACGTCAGTCAGTCAGTTAACAACAGTATCACCAGCTTCTGTTGCC 
G CT G AAAC ACC AG CT C C AGT AG CT AAAGT AG C AC C GG T AAG AAC T GT AGC 
AGCCCCTAGAGTGGCAAGTGcTAAAGTAGTCACTCcTAAAGTAGAAACTG 
GTGCATCACCAGAGCATGTATCAGCTCCAGCAGTTCCTGTGACTACGACT 
T C AC C AG C T AC AG AC AG T AAGT T AC AAG C GAC T G AAGT T AAG AG C GT T C C 
GGT AG C AC AAAAAGC T CC AAC AG C AAC AC C G G TAG C AC AAC C AG C T T C AA 
C AAC AAAT G C AGT AG C T G C AC AT CCT GAAAAT G C AG GG CT C C AAC CT CAT 
GT T G C AGC T TAT AAAG AAAAAGT AGC GT C AAC TT AT G G AGT T AAT GAAT T 
C AGT AC AT AC CG T G C GGG AG AT C C AGG T GAT CAT GGT AAAGGTT T AG C AG 
T T G ACT T TAT T G T AGGT AAAAAC C AAG C ACT T G GT AAT G AAG T T G C AC AG 
TaCTCTACACAAAATATGGCAGCAAATAACATTTCATATGTTATCTGGCA 
AC AAAAGT T T TAT T C AAAT AC AAAT AGT AT T TAT G GAC C T G C T AAT ACT T 
G GAAT G C AAT G C C AG AT CGTGGTGG C GT T AC T G C C AAC C AC TAT G ACC AC 
GTTCACGTATCATTTAACAAATAATATAAAAAAGGAAGCTATTTGGCTTC 
TTTTTTATATGCCTTGAATAGACTTTCAAGGTTCTTATATAATTTTTATT 
A 

SEQ ID NO. 6907 

STRAIN M732 

CTGATTTGGTAAAGCAAGACAATAAATCATCATATACTGTGAAATATGGT 
G AT AC AnT AAGCGTT ATT TCAGAAGCAATGTCAATT GAT AT G AAT GTCTT 
AGCAAAAAT T AAT AAC ATTGCAGATAT C AAT CT T AT TT AT CCT GAGAC AA 
CACTGAC AGT AAC T T AC GAT C AGAAG AGT C At ACTGCC ACTTCAATGAAA 
AT AG AAAC AC C AG C AAC AAAT G CT GC T GGT C AAAC AAC AG CT ACT GT c G A 
TTTGAAAACCAATCAAGTTTTTGTTGCAGACCAAAAAGTTTCTCTCAATA 
CAATTTCGGAAGGTATGACACCAGAAGCAGCAACAACGATTGTTTCGCCA
AT GAAG AC AT AT TCTTCTGCGC C AG CT T T GAAAT C AAAAGAAGT AT T AG C 
ACAAGAGCAAGCTGTTAGTCAAGTAGCAGCTAATGAACAGGTATCACCAG 
CT C C T GT G AAGT C G AT T AC T T C AG AAGT T C C AG C AG CT AAAG AGG AAGT T 
AAAC C AAC T C AG AC GT C AGT C AGT CAGT T AAC AAC AGT AT C AC C AG C T T C 
TGTTGCCGCT G AAAC AC C AG CT C C AG TAG C T AAAGT AG C AC CG G T AAG AA 
CTGTAGCAGCCCCTAGAGTGGCAAGTGCTAAAGTAGTCACTCCTAAAGTA 
GAAACTGGTGCATCACCAGAGCATGTATCAGCTCCAGCAGTTCCTGTGAC 
T ACGACT T C AC C AG C T AC AG AC AGT AAGT TAG AAG C GAC T GAAG T T AAG A 
GCGTTCCG GT AGC AC AAAAAG CT C C AAC AG C Aa C AC C G GT AG C AC AAC C A 
GC T T C AAC AAC AAAT G C AGT AGC T G C AC AT C CT G AAAAT G C AG GG CT C C A 
AC C T C AT GT T GC AG C T T AT AAAG AAAAAGT AG C GT C AAC T T AT GG AGT T A 
AT GAAT T CAGT AC AT AC C G T G C GG G AG AT C C AG GT GAT CAT GGT AAAG GT 
TTAGCAGTTGACTTTAttgtaggtaaaaaccAAGCACTTGGTAATGAAGT 
T G C AC AGT ACT c T AC AC AAAAT AT GG C AGC AAAT AAC AT T T C AT AT GT T A 
TCTGGCAACAAAAGTTTTATTCAAATACAAATAGTATTTATGGACCTGCT 
AATACTTGGAATGCAATGCCAGATCGTGGTGGCGTTACTGCCAACCACTA 
TGACCACGTTCACGTATCATTTAACAAATAATATAAAAAAGGAAGCTATT 
TGGCTTCTTTTTTATATGCCTTGAATAGACTTTCAAGGTTCTTATATAAT 
TTTTATTA 

SEQ ID NO. 6908 

STRAIN M781 

CT G AT T T GG T AAAG C AAG AC AAT AAAT CAT CAT AT ACT GT GAAAT AT GGT 
GAT AC ACT AAG C GT T AT T T C AG AAG C AAT GT C AAT T GAT AT GAAT GT C T T 
AGC AAAAAT T AAT AAC AT T G C AGAT AT C AAT C T TAT T TAT C CT G AG AC AA 
C ACT GAC AGT AAC T T AC GAT C AG AAG AGT CAT AC T GC C AC T T C AAT G AAA 
ATAGAAACACCAGCAACAAATGCTGCTGGTCAAACAACAGCTACTGTCGA 
T T T GAAAAC C AAT CAAGTTT T TGTTGC AGACCAAAAAGT TT CT CT CAAT A 
CAATTTCGGAAGGTATGACACCAGAAGCAGCAACAACGATTGTTTCGCCA 
AT GAAG AC AT AT TCTTCTGCGC C AG C T T T GAAAT C AAAAGAAGT AT T AGC 
AC AAG AG C AAG C T GT T AGT C AAG TAG C AG CT AAT G AAC AG G TAT C AC C AG 
CT C CT GT G AAGT C G AT T AC T T C AG AAGT T C C AG C AG CT AAAGAGG AAGT T 
AAAC C AACT C AGACG T CAGT CAGT C AG T T AAC AAC AGT AT C AC C AG C T T C 
TGTTGCCGCT G AAAC AC C AGC T C CAGT AG C T AAAG TAG C AC C G GT AAG AA 
CT GT AGC AG C C C CT AGAG T G G C AAGT G C T AAAGT AGT C AC T C CT AAAG T A 
GAAACTGGTGCATCACCAGAGCATGTATCAGCTCCAGCAGTTCCTGTGAC 
T AC G ACT T C AC C AG CT AC AGAC AGT a a GT T AC AAG C GAC T G AAGT T AAG A 
GC GT T C CGGT AG C AC AAAAAG C T C C AAC AGC AAC AC C GGT AGC AC AAC C A 
GCTTCAACAACAAATGCAGTAGCTGCACATCCTGAAAATGCAGGGCTCCA 
AC CT CAT GT T G C AG C T TAT AAAG AAAAAGT AG C GT C AACT TAT GG AG T T A 
AT GAAT T C AG T AC AT AC CG T G C G G G AGAT C C AG GT G AT CAT GGT AAAG GT 
T TAG C AGT T G ACT T TAT T GT AGGT AAAAAC C AAG C ACT T GGT AAT G AAGT 
T G C AC AGT ACT C T AC AC AAAAT AT GG C AG C AAAT AAC AT T T CAT AT GT T A 
T C T G GC AAC AAAAGT T T TAT T C AAAT AC AAAT AGT AT T TAT G GAC C T G C T 
AAT AC T T G GAAT G CAAT G C C AG AT CGTGGTGGCGT T AC T G C C AACC AC T A 
T GAC C AC G T T C AC GT AT CAT T T AAC AAAT AAT AT AAAAAAGG AAGC T a T T 
T GG CTT CT T T T T T AT AT GC C T T G AAT AgAC T T T C AAGGT T C T TAT AT AAT 
TTTTATTA 

SEQ ID NO. 6909 

STRAIN CJB110 

C T GAT T T GG T AAAG C AAG AC AAT AAAT CAT C AT AT ACTGTGAAA 

TAT GGT GAT AC ACT AAG C G T TAT T T C AG AAGC AAT G T CAAT T GAT AT G AA 

T GT C TT AG C AAAAAT T AAT AAC AT T G C AG AT AT CAAT C T TAT T TAT C CT G 

AG AC AAC AC T GAC AGT AAC T T AC GAT C AG AAG AGT CAT AC T G C C AC T T C A 

ATGAAAATAGAAACACCAGCAACAAATGCTGCTGGTCAAACACCAGCTAC 

TGTGGATTT GAAAAC CAAT CAAGTTTcTGTTGCAGACCAAAAAGTTTCTC 

T CAAT AC AAT T T C G G AAG GT AT GAC AC C AG AAG C AG C AAC AAC GAT T G T T 

T CGCCAAT G AAG AC AT ATT CTT CTGCGCCAGCTTTG AAAT C AAAAGAAGT 
AT TAG C AC AAG AG C AAG C T GT T AGT C AAG C AG C AG CT AAT G AAC AG GT AT 
C AAC AG C T C C T G T G AAGT C G AT T AC T T C AG AAG T T C C AG C AG C T AAAG AG 
G AAGT T AAAC C AAC T C AG AC GT CAGT CAGT C AG T C AAC AAC AGT AT C AC C 
AgCTTCTGTTGCCGCTGAAACACCAGCTCCAGTAGCTAAAGTAGCACCGG 
TAAgAACTGTAGCAGCCCCTAGAGTGGCAAGTGTTAAAGTAGTCACTCCT
AAAGT AG AAAC T GGT G CAT C AC C AGAG CAT GT AT CAG CT C C AG C AGT T C C 
TGTGACT ACGACTT C AACAGc T ACAGACAGT a AGTT a C AAGCGACT GAAG 
T T AAGAG CGT T C C GGT AG C AC AAAAAG CT C C AAC AG C AAC AC CGGT AGC A 
CAAC C AGCTT CAACAACAAATGCAGTAGCTGCACATCCTGAAAAT GCAGG 
GCTCCAACCTCATGTTGCAGCTTATaAAGAAAAAGTAGCGTCAACTTATG 
GAGT T AAT GAAT T C AGT AC AT a C CGT G CAG GT GAT C C Ag GT G AT CAT GGT 
AAAGGT T TAG C AGT c G AC T T T AT T GT Ag GT AAAAACC AAG C AC T T G GT AA 
T G AAGT T G C AC AGT ACT C T AC AC AAAAT AT GG C AG C AAAT AAC AT T T CAT 
AT GT TAT C T G G C AAC AAAAGT T T TAG T C AAAT AC AAAT AGT AT T TAT G G A 
CCTGCTAATACTTGGAATGCAATGCCAGATCGTGGTGGCGTTACTGCCAA 
CCAT TAT G AC CAT G T T C AC G T AT CAT T T AAC AAAT AAT AT AAAAAAG G AA 
GCTATTTGGCTTCTTTTTTATATGCCTTGAATAGACtTTCAAGGTTCTTA 
TATAATTTTTATTA 

SEQ ID NO. 6910 

STRAIN 1169NT 
CTGATTTG 

GT AAAGC AAG AC AAT AAAT CAT C AT AT ACT GT GAAAT ATGGT GAT AC ACT 
AAGCGTT ATTT CAGAAGCAAT GT CAATTGAT AT GAAT GT CTT AGC AAAAA 
T T AAT AAC AT T G C AGAT AT C AAT CT TAT T TAT C c T GAG AC AAC AC T G AC A 
GT AACTTACGAT CAgAAGAGT CAT ACTGCCACT T CAAT GAAAATAGAAAC 
AC C AGC AAC AAAT GCTGCTGGT C AAAC AAC AG C TAG T GT GG AT T T G AAAA 
CCAATCAAGTTTCTGTTGCAGACCAAAAAGTTTCTCTCAATACAATTTCG 
G AAGGT AT G AC AC CAG AAG CAg CAAC AAC GAT T GT T T C G C CAAT G AAGAC 
ATATTCTTCTGCGCCAGCTTTgAAATCAAAAGAAGTATTAGCACAAGAGC 
AAGCTGTTAGTCAAGCAGCAGCTAATGAACAGGTATCACCAGCTCCTGTG 
AAG T CG AT T AC T T CAg AAGT T C CAg CAG C T AAAG AG G AAGT TAG AC C Aa C 
TcAGACGTCAGTCAGTCAGTCAACAACAGTATCACCAgCTTCTGTTGCCG 
CTGAAACACCAGCTCCAGTAGCTAAAGTAGCACCGGTAAGAACTGTAGCA 
G C C C C AGC C C C T AGAGT G G C AAGT G C T AAAGT AGT C ACT C C T AAAGT AG A 
AAcTGGTGCATCACCAGAGCATGTACCAGCTCCAGCAGTTcCTGTGACTA 
c G AC T T CAAC AG C T AC a G AC Aa T a AG T T AC AAG C G AC T G AAGT TAAg AGC 
GtTCCGGTgGCACAAAAAGCTCCAACAGCAACACCGGTaGCACAACCAGC 
T T c AACAAC AAAT GCAGTAGc T GCAC AT C CT GAAAAT GCAGG ACT CCAAC 
CTCATGTTGCAGCTTATAAAGAAAAAGTAGCGTCAACTTATGGAGTTAAT 
GAAT T C AGT AC AT aC C GT G C G GG AGAT CC AGG T GAT CAT G GT AAAGGT T T 
AG C AGT T G AC T T T AT T GT a g GT AAAAAC C AAG C AC T T GGT AAT G AAGT TG 
C AC AGT AC T C T AC AC AAAAT AT GG CAG C AAAT AAC AT T T CAT AT GT TAT C 
T G G CAAC AAAAGT T T T ACT C AAAT AC AAAT AGT AT T T AT GG AC CT G C T AA 
TACTTGGAATGCAATGCCAGATCGTGGTGGCGTTACTGCCAACCACTATG 
AC C AC GT T C AC GT AT CAT T T AAC AAAT AAT AT AAAAAAG GAAG C T AT T T G 
GCTTCTTTTTTATATGCCTTGAATAGACTTTCAAGGtTCTTATATAATTT 
TTATTA 

SEQ ID NO. 6911 

STRAIN JM9130013

c t gat t t g g t aaagc aag ac aat aaat cat cat at act 

g tg aaat at ggt gat ac ac taag c gt t at t t cag aag caat gt caat t g a 

tat gaat gt c t tag c aaaaat aaat aac at t g cag at at caat ct t at t t 

at c c t gag ac aac ac t g ac agt aac t t ac gat cag aag ag t cat ac t g c c 

acttcaatgaaaatagaaacaccagcaacaaatgctgctggtcaaacaac

agc tag t gt g g at tt g aaaac caat c aagt t t c t gt t g cag ac c aaaaag 

t t t ct c t caat ac aat t t c gg aag gt at g ac ac cag aag cag caac aac g 

attgtttcgccaatgaagacatattcttctgcgccagctttgaaatcaaa 

agaagtattagcacaagagcaagctgttagtcaagcagcagctaatgaac 

AGG TAT C AC CAG C T C C T GT GAAG T C GAT TAG T T CAG AAG T T C C AG C AG CT 
AAAG AG GAAG T T AAAC C AAC T CAG AC G T CAG T CAG T CAG T CAAC AAC AG T 
AT C AC CAg CTTCTGTTGCCGCT G AAAC AC C AG CT C CAG T AGC T AAAGT AG 
CACCGGTAAGAACTGTAGCAGCCCCTAgAGTGGCAAGTGTTAAAGTAGTC 
ACT C C T AAAGT AG AAACT GGT G C AT C AC CAG AG CAT GT AT CAG C T C CAG C 
AGTTCCTGTGACTACGACTTCACCAGCTACAGaCAGTAAGTTACAAGCGA 
c T GAAG T TAAG AG C G T T C C GGT AG C AC AAAAAG C T C CAAC AG CAAC AC C G 
GT AG C a CAAC CAG CTT CAAC AAC AAAT GC AGT AG CT G C AC AT C CT G AAAA 
TGCAGGGCTCCAACCTCATGTTGCAGCTTATAAAGAAAAAGTAGCGTCAA
CTTATGGAGTTAATGAATTCAGTACATACCGTGCGGGAGATCCAgGTGAT 
C AT GGT AAAG GT T T AGC AGT T G AC T T T AT T GT AGGT ACT AAT C AAG C AC T 
T GGT AAT AAAGT T G C AC AGT ACT C T AC AC AAAAT AT G G C AGC AAAT AAC A 
T T T C AT AT GT TAT CT GG C AAC AAAAGT T T T AC T C AAAT AC AAAC AGT AT T 
TATGGACCTGCTAATACTTGGAATGCAATGCCAGATCGTGGTGGCGTTAC 
TGCCAACCACTATGACCACGTT CACGT AT C AT TT AAC AAAT AAT ATAAAA 
AAGGAAGCTATTTGGCTTCTTTTTTATATGCCTTGAATAGACTTTCAAGG 
T T C T TAT AT AAT T T T TAT T A 

SEQ ID NO. 6912 
STRAIN 2603 frame: 1

MNKKVLLTSTMAASLLSVASVQAQETDTTWTARTVSEVKADLVKQDNKSSYTVKYGDTLS 
VISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSHTATSMKIETPATNAAGQTTA 
TVDLKTNQVSVADQKVSLNTISEGMTPEAATTIVSFMKTYSSAPALKSKEVLAQEQAVSQ 
AAANEQVSPAPVKSITSEVPAAKEEVKPTQTSVSQSTTVSPASVAAETPAPVAKVAPVRT
VAAPRVASVKVVTPKVETGASPEHVSAPAVPVTTTSPATDSKLQATEVKSVPVAQKAPTA
TPVAQPASTTNAVAAHPENAGLQPHVAAYKEKVASTYGVNEFSTYRAGDPGDHGKGLAVD 
FIVGTNQALGNKVAQYSTQNMAANNISYVIWQQKFYSNTNSIYGPANTWNAMPDRGGVTA 
NHYDHVHVSFNK.YKKGSYLASFLYALNRLSRFLYNFY

SEQ ID NO. 6913 

STRAIN 090 frame: 2

ETTLTVTYDQKSHTATSMKIETPATNAAGQTPATVDLKTNQVSVADQKVSLNTISEGMTP 
EAATTIVSPMKTYSSAPALKSKEVLAQEQAVSQAAANEQVSTAPVKSITSEVPAAKEEVK 
PTQTSVSQSTTVSPASVAAETPAPVAKVAPVRTVAAPRVASVKVVTPKVETGASPEHVSA
PAVPVTTTSTATDSKLQATEVKSVPVAQKAPTATPVAQPASTTNAVAAHPENAGLQPHVA 
AYKEKVASTYGVNEFSTYRAGDPGDHGKGLAVDFIVGKNQALGNEVAQYSTQNMAANNIS 
YVIWQQKFYSNTNSIYGPANTWNAMPDRGGVTANHYDHVHVSFNK.YKKGSYLASFLYAL
NRLSRFLYNFY 

SEQ ID NO. 6914 

STRAIN A909 frame: 3

DLVKQDNKSSYTVKYGDTLSVISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSH
TATSMKIETPATNAAGQTTATVDLKTNQVSVADQKVSLNTISEGMTPEAATTIVSPMKTY
SSAPALKSKEVLAQGQAVSQAAANEQVSPAPVKSITSEVPAAKEEVKPTQTSVSQSTTVS
PASVAAETPAPVAKVAPVRTVAAPRVASVKVVTPKVETGASPEHVSAPAVPVTTTSTATD
SKLQATEVKSVPVAQKAPTATPVAQPASTTNAVAAHPENARLQPHVAAYKEKVASTYGVN 
EFSTYRAGDPGDHGKGLAVDFIVGKNQALGNEVAQYSTQNMAANNISYVIWQQKFYSNTN 
. S I YGPANTWNAMPDRGGVTANHYDHVHVS FNK . YKKGS YLAS FLYALHRLSRFLYNFY 

SEQ ID NO. 6915 

STRAIN H36B frame: 3

DLVKQDNKSSYTVKYGDTLSVISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSH
TATSMKIETPATNAAGQTTATVDLKTNQVSVADQKVSLNTISEGMTPEAATTIVSPMKTY
SSAPALKSKEVLAQGQAVSQAAANEQVSPAPVKSITSEVPAAKEEVKPTQTSVSQSTTVS
PASVAAETPAPVAKVAPVRTVAAPRVASVKVVTPKVETGASPEHVSAPAVPVTTTSTATD
SKLQATEVKSVPVAQKAPTATPVAQPASTTNAVAAHPENARLQPHVAAYKEKVASTYGVN
EFSTYRAGDPGDHGKGLAVDFIVGKNQALGNEVAQYSTQNMAANNISYVIWQQKFYSNTN
SIYGPANTWNAMPDRGGVTANHYDHVHVSFNK.YKKGSYLASFLYALHRLSRFLYNFY

SEQ ID NO. 6916 

STRAIN 18RS21 frame: 3 

DLVKQDNKSSYTVKYGDTLSVISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSH
TATSMKIETPATNAAGQTTATVDLKTNQVSVADQKVSLNTISEGMTPEAATTIVSPMKTY
SSAPALKSKEVLAQEQAVSQAAANEQVSPAPVKSITSEVPAAKEEVKPTQTSVSQSTTVS
PASVAAETPAPVAKVAPVRTVAAPRVASVKVVTPKVETGASPEHVSAPAVPVTTTSPATD
SKLQATEVKSVPVAQKAPTATPVAQPASTTNAVAAHPENAGLQPHVAAYKEKVASTYGVN
EFSTYRAGDPGDHGKGLAVDFIVGTNQALGNKVAQYSTQNMAANNISYVIWQQKFYSNTN
SIYGPANTWNAMPDRGGVTANHYDHVHVSFNK.YKKGSYLASFLYALNRLSRFLYNFY

SEQ ID NO. 6917 

STRAIN M732 frame: 3 

DLVKQDNKSSYTVKYGDTXSVISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSH 
TATSMKIETPATNAAGQTTATVDLKTNQVFVADQKVSLNTISEGMTPEAATTIVSPMKTY
SSAPALKSKEVLAQEQAVSQVAANEQVSPAPVKSITSEVPAAKEEVKPTQTSVSQLTTVS 
PASVAAETPAPVAKVAPVRTVAAPRVASAKVVTPKVETGASPEHVSAPAVPVTTTSPATD
SKLQATEVKSVPVAQKAPTASPVAQPASTTNAVAAHPENAGLQPHVAAYKEKVASTYGVN
EFSTYRAGDPGDHGKGLAVDFIVGKNQALGNEVAQYSTQNMAANNISYVIWQQKFYSNTN
SIYGPANTWNAMPDRGGVTANHYDHVHVSFNK.YKKGSYLASFLYALNRLSRFLYNFY

SEQ ID NO. 6918 

STRAIN COH1 frame: 3 

DLVKQDNKSSYTVKYGDTLSVISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSH 
TATSMKIETPATNAAGQTTATVDLKTNQVFVADQKVSLNTISEGMTPEAATTIVSPMKTY
SSAPALKSKEVLAQEQAVSQVAANEQVSPAPVKSITSEVPAAKEEVKPTQTSVSQLTTVS
PASVAAETPAPVAKVAPVRTVAAPRVASAKVVTPKVETGASPEHVSAPAVPVTTTSPATD
SKLQATEVKSVPVAQKAPTATPVAQPASTTNAVAAHPENAGLQPHVAAYKEKVASTYGVN
EFSTYRAGDPGDHGKGLAVDFIVGKNQALGNEVAQYSTQNMAANNISYVIWQQKFYSNTN
SIYGPANTWNAMPDRGGVTANHYDHVHVSFNK.YKKGSYLASFLYALNRLSRFLYNFY

SEQ ID NO. 6919 

STRAIN M781 frame: 3

DLVKQDNKSSYTVKYGDTLSVISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSH 
TATSMKIETPATNAAGQTTATVDLKTNQVFVADQKVSLNTISEGMTPEAATTIVSPMKTY 
SSAPALKSKEVLAQEQAVSQVAANEQVSPAPVKSITSEVPAAKEEVKPTQTSVSQLTTVS
PASVAAETPAPVAKVAPVRTVAAPRVASAKVVTPKVETGASPEHVSAPAVPVTTTSPATD
SKLQATEVKSVPVAQKAPTATPVAQPASTTNAVAAHPENAGLQPHVAAYKEKVASTYGVN
EFSTYRAGDPGDHGKGLAVDFIVGKNQALGNEVAQYSTQNMAANNISYVIWQQKFYSNTN
SIYGPANTWNAMPDRGGVTANHYDHVHVSFNK.YKKGSYLASFLYALNRLSRFLYNFY

SEQ ID NO. 6920 

STRAIN CJB110 frame: 3 

DLVKQDNKSSYTVKYGDTLSVISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSH 
TATSMKIETPATNAAGQTPATVDLKTNQVSVADQKVSLNTISEGMTPEAATTIVSPMKTY 
SSAPALKSKEVLAQEQAVSQAAANEQVSTAPVKSITSEVPAAKEEVKPTQTSVSQSTTVS 
PASVAAETPAPVAKVAPVRTVAAPRVASVKVVTPKVETGASPEHVSAPAVPVTTTSTATD
SKLQATEVKSVPVAQKAPTATPVAQPASTTNAVAAHPENAGLQPHVAAYKEKVASTYGVN
EFSTYRAGDPGDHGKGLAVDFIVGKNQALGNEVAQYSTQNMAANNISYVIWQQKFYSNTN
SIYGPANTWNAMPDRGGVTANHYDHVHVSFNK.YKKGSYLASFLYALNRLSRFLYNFY

SEQ ID NO. 6921 

STRAIN 1169NT frame: 3

DLVKQDNKSSYTVKYGDTLSVISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSH 
TATSMKIETPATNAAGQTTATVDLKTNQVSVADQKVSLNTISEGMTPEAATTIVSPMKTY 
SSAPALKSKEVLAQEQAVSQAAANEQVSPAPVKSITSEVPAAKEEVRPTQTSVSQSTTVS
PASVAAETPAPVAKVAPVRTVAAPAPRVASAKVVTPKVETGASPEHVPAPAVPVTTTSTA
TDNKLQATEVKSVPVAQKAPTATPVAQPASTTNAVAAHPENAGLQPHVAAYKEKVASTYG
VNEFSTYRAGDPGDHGKGLAVDFIVGKNQALGNEVAQYSTQNMAANNISYVIWQQKFYSN
TNSIYGPANTWNAMPDRGGVTANHYDHVHVSFNK.YKKGSYLASFLYALNRLSRFLYNFY

SEQ ID NO. 6922 

STRAIN JM9130013 frame: 3 

DLVKQDNKSSYTVKYGDTLSVISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSH 
TATSMKIETPATNAAGQTTATVDLKTNQVSVADQKVSLNTISEGMTPEAATTIVSPMKTY
SSAPALKSKEVLAQEQAVSQAAANEQVSPAPVKSITSEVPAAKEEVKPTQTSVSQSTTVS
PASVAAETPAPVAKVAPVRTVAAPRVASVKVVTPKVETGASPEHVSAPAVPVTTTSPATD
SKLQATEVKSVPVAQKAPTATPVAQPASTTNAVAAHPENAGLQPHVAAYKEKVASTYGVN
EFSTYRAGDPGDHGKGLAVDFIVGTNQALGNKVAQYSTQNMAANNISYVIWQQKFYSNTN
SIYGPANTWNAMPDRGGVTANHYDHVHVSFNK.YKKGSYLASFLYALNRLSRFLYNFY

SEQ ID. NO. 7001 
STRAIN 2603 

AT G GGAG GG AAAAT G AAT C AAGAAGT C T T AC T AC AAAT GAT GAG AG C C ACT AT T C C T C 
G T GAT AG AG CCTTGCTT GAG G CAT T T T T AT AT T AC C AAG C AG AG CAT T T T GAT GAG G AGT 
G G GAT AG T C T TAT T CAT C AG T T T AT G AC C AAT AG G CAAG AAAT AAAT AAGT C T GT T C AAG 
TACTTCACTTTGAGACAGATGTTTCAGCTTTTGTCCAGGCTAGTCCTTATGATACTGCTC 
AT GAT CT AT T G AC C TAT AC AC AAGT T T T C G GC C AAAG T G GT C T T C AAAAAC T AG AT AAAC 
TATCGCCGTCTGAAAAAAACTTGGTGATAGAAGTGGCCTTGTTCAATCTGGCCACTCGTT
TTCAATTATTGGATTCCAATGGACACTACCAAACCATATCGCCGGATTCACTCTTACAAA 
AGAGTAGGGGAGCTAATTTGGTCAATGTGTATCGTGTGGCTAATAATTTAGCGGATCGTA 
T T AGT C G AGAT AT T G AAC AG T T T CT CT T AACT T AC GAG C CT GAG CT T GAAAC T AG AGC T G 
AT G AAACT GT T C TAG AAAAT GAAG AAACT G TT GAT GAG C AC AAAAC AAGT GT T CAT C AAG 
C AAT AT CT T T T C GAG AAG AG GGCTCTCTGGT TAT T G C T AGT T T GG AT GT AGAT TT GT CT C 
AACT AG AT GT T C AAAT AG G AAAAAC C AGT CAT C T G C C AG CT T AT GAAG AGT TAT C CT T AC 
G AC GT AAAT T T GAG AT T C T AAC AT AT T T T G AC C AAAT T CG AAAT G AACGT T C CAAAGT C C 
C AAGT T T T AGAC G AGGT GAT T T T G AC AC AG AG AT G GAAAT G AC AC C AGT C T T T GAT G GC G 
AGGAATTACTTACTTATCTCGAAGCTGATGGCAGTCCCTATGAGCTGAAACGAACGCTGA 
C T AC AGT C GAAG AAAAG G AAT T AG AAAAAAT T GG AC AAG C CAT T AGG AT AG AAAAT C AAG 
AAAAATTGACTCAGCTAGGGATTGATTTATCTCAGTTTGACCCAGACCGAGTCGGTATTT 
TATTGGATGCAGCAGGTCGTTTTCGTTTAAAAAATGCAGACCTTGCTTTACTAGGTGGTT 
AT C C C AAAG C CT C GGT AAC T C AAC T AGC C CT T G C G AC AG AACT AC T C C AAAT GGG ACT AA 
GTCATGAAAAGGTTGAATTTTTCTTTGGTAGCCAGCTTTCCATTGAAGAGCTGCGACAAG 
TTGC CT ACGCCTT T TT AT ACC AAGAACT CAGCAGAGAAGAT GCGGAGCAAT TT GAAAAAG 
AT AAAGGT AAT CAGC CAGAT TT AACT CT CAGAGATTGGAAAAGCAAGCTAGAGAAAGCT G 
AG G G AAAAGAAGT AGT T GAT GAAG AAT T C GCGG AAAAT C C AC T G GT T C AG AG AGT AT T G G 
AC AC T TAT CCTCTGGGGT CAT T GG T T T C C TAT AAG G G AC AG G AC T T T G AGGT CAT GT C GG 
TCAGCGATGCTCGATTGAACGGTTTGATTCGGATTGAGTTAGTCAATGACTTTTCGGATA 
T C AT T G AAC AAAAT C C AGT T C T T T AT GT G AGG AC CT GGGAAG AAG T C AGT C AGG C ACT T C 
AT C AG C C AAAG G C AG AAC C AC AAAC AG AG T TAG AAG AAG C GG AC C AAG AAT T AAAC CT AT 
T CT C ATTT CT GGAAGAGGAGCC AGTT CAGAGT ATT GGACTATTGGAAC CAGAT GAT T CAG 
AAAAT GGT CAT AACGAT ACT GAT CTT GAAG AAAC AGAT AAT CAAATT C CTGAAGAGGAAG 
TCGTCGAAACAATTCCAGAGATTCCAGTAACGGACTTTTATTTTCCAGAAGATTTGACGG 
ACTTTTATCCTAAGACTGCTAGAGATAAGGTTGAGACAAACATTGTGGCCATTCGTTTGG 
T AAAAAAT C TAG AAG TAG AGC AC CG C AAT GC T T C ACC AAGT G AAC AAG AAC TCCTTGCCA 
AGTATGTAGGCTGGGGTGGACTAGCCAATGAATTTTTTGATGACTATAATCCAAAATTTT 
CTAAGGAACGAGAAGAACTGAAGAGCCTAGTCACAGATAAAGAGTATTCGGATATGAAAC 
AGTCCTCCCTGACAGCCTATTACACAGACCCATCCCTGATCCGTCAGATGTGGGATAAGT 
T G G AAAGAG AT GG CT T T AC AG GT GG C AAAAT C CT AG AT CCT T C CAT GGG AAC AG GG AAT T 
TCTTTGCGGC TAT G C C AAAAC AC T T AAG AG AAAAG AGT G AGT T GT AT G G C G TAG AG T T AG 
AT ACT AT T AC AG GAG CT AT T GCC AAAC AC CT T CAT C C C AAT AGT CAT AT T GAAAT T AAG G 
GATTTGAGACGGTGGCTTTTAACGACAATAGTTTTGATTTGGTGATTTCAAATGTGCCCT 
TTGCCAATATACGAATTGCGGATAATAGGTACGATAGGCCTTACATGATTCATGACTACT 
TTGTCAAAAAGTCACTTGATTTGCTTCATGATGGTGGACAAGTAGCGATTATCTCTTCCA 
CAG G AAC TAT GG AT AAG C G AAC AGAAAAC AT C T T AC AAG AT AT T CG T GAG AC AAC T G AAT 
TTCTTGGTGGGGTTCGACTGCCTGACTCTGCCTTTAAGGCCATTGCAGGAACGAGTGTCA 
C AAC G GAT AT GT T AT T C T T C CAG AAAC ACT TAG AC AAGG GAT AT G T GG CAG AC GAT T T AG 
CCTTTTCAGGTTCCATTCGCTATGACAAGGATAGTCGCATTTGGCTCAATCCTTATTTTG 
ATGGAGAATACAATAGCCAGGTGCTAGGAACCTACGAGGTCAGGAATTTTAACGGAGGAA 
CACTTTCTGTTAAGGGGACTAGTGATGACTTGATTGCAAGTGTTGAAACAGCTCTAAATC 
ACGTTAAGGCCCCAAGAGAGATTGATAGAAATGAGGTCATCATTAACCCAGATGTGTTGA 
CCAAACAAGTCAATGATACCTCCATTCCAGCTGAAATGAGGGAAAATCTAGGTCAGTACA 
G T T T T G GT T AT C AGGGGT C T AC AGT T TAG TAT C GAG AT AAC AAAG G C AT T C G AGT C G G AA 
CCAAGACGGAAGAAATCAGTTACTATGTCGATGAAGAGGGCAACTTCAAAGCATGGGACA 
C C AAAC AT T C T C AAAAG CAGAT T GAT CG C T T T AAT GCC T TAG AAGT G AC T GAT AAC ACT G 
CTCTGGATGTCTATGTGACCGATGATGCAGCCAAACGTGGTCAGTTTAAGGGGTATTATA 
AAAAGACAGTTTTCTATGAAGCTCCATTGTCTTATAAAGAAGTGGCACGTATCAAAGGAA 
T G GT CG AT AT T C G C AAT GC C T AC C AAG AAGT TAT T G C CAT T C AAC G C TAT TAT GAC T AT G 
ATAAGGAGACCTTTAACCACTTGTTAGGCAAACTCAATCGTACCTATGATAGCTTTGTCA 
AAC AC TAT GG GT AT T T G AAT AGT G CT GT G AAC CGC AAT C T T T T T GAT AGT GAT GAT AAG T 
ATTCGCTTCTTGCTAGTTTGGAAGATGAAAGTCTGGATCCAAGTGGAAAGTCTGTTATCT 
ATACTAAATCCCTTGCCTTTGAGAAGGCTCTAQTGCGTCCTGAAAAAGAGGTTAAAAAGG 
TGCATACTGCCCTTGATGCCTTAAATTCGAGCTTGGCTGACGGACGAGGTGTTGATTTCG 
C T TAT AT G AT GT C TAT CT AT C AG GT T G AAT CG CAGAT G AC CT T GAT T GAG G AGT T AGG C G 
AC CT CAT TAT GCC T G AT CCT GAG AAGT AT T T G AAT GG AG AAT T G AC CT AT GTTTCTCGCC 
AAG AC T T T CT T T C AGG GG AT G T C G T C ACT AAGT TAG AAGT GGT AG AT C TAT T CGT C AAAC 
AAGACAATCAGGACTTTAACTGGTCACATTATGCGGGACTTCTAGAAGCTATCAAACCAG 
CCCGTATTACTTTGGCAGACATTGATTATCGAATCGGTTCACGCTGGATTCCTCTGGCTG 
TTTATGGAAAATTTGCCCAAGAAACCTTTATGGGGAAAGCCTATGAACTGTCAGACCAAG 
AAGT AG C GAC AGT C CT AG AAG T C AGT C C CAT T G ACGG G GT T AT C AC T T AC C AAT CT AAGT 
TTGCCTACACCTATTCCAACGCAACGGATAGGAGTTTAGGTGTCCCTGCTTCACGCTATG 
AT AG T G GT C G AAAAAT C T T T G AAAAT C T C CT G AAT T C C AAT C AAC C AAC CAT C AC AAAAC 
AAGT T GT C G AAGG GGAT AAG AAAAAGAAT GT GAC G GAT G TAG AG AAAAC AAC GGTCCTGC
GTGCCAAGGAAACACACCTACAAGAACTCTTTCAAGGTTTTGTAGCAAAGTATCCAGAAG 
T C C AAC AAAT GAT T G AAG AC AC C T AT AAT AGGC T CT AC AAT C GT AC GGT AT C AAAG T C CT 
AT GAT GG T AGT CAT T T AAC CAT T GAT GGACT T G CT CAG AAT AT C T C C T T AC GT C C T C AC C 
AAAAG AAT G CC AT T C AACGAATT GT CG AGGAAAAACGT GCT CT ACT AG C T CAT G AAGT T G 
GTTCAGGTAAAACACTTACCATGCTTGGGGCAGGATTCAAACTGAAAGAACTCGGAATGG 
TACATAAACCACTTTATGTGGTGCCGTCTAGTCTGACTGCTCAGTTTGGTCAAGAAATCA 
T G AAAT T T T T C C C T AC C AAG AAAGT CT AT GT G ACT ACT AAG AAAG ACT T T G C C AAAGC C A 
AAC GC AAGC AGT TT G T GT C C C G TAT TAT T AC AGGG G AC TAT GAT G C CAT T GT CAT T G G G G 
AT T C AC AAT T T GAG AAG AT AC C GAT G AGT CGT GAAAAAC AG GT C AC C T AT AT C AAT G AC A 
AAC T T GAG C AACT C C GAG AAAT C AAG C T AG G AAGT GAC AGT G AT T AC AC GG T G AAAG AAG 
CGGAACGTTCGATTAAGGGATTAGAACACCAGTTGGAAGAACTCCAAAAACTAGAGCGAG 
ATACCTTTATTGAGTTTGAAAACCTTGGAATTGATTTTCTTTTTGTGGATGAGGCTCATC 
AC T T C AAGAAT AT C C GT C C AAT C AC T GGACT T G GG AAT GTAGC T GG AAT C AC C AAC AC AA 
CT T CT AAAAAG AAC GT G GAT AT GG AG AT GAAGGT GAG AC AAGT AC AG G CAG AG C AT GG AG 
AT AG AAAT GTCGTTTTT G CGAC AGG AAC AC C AGT T T C T AACT CT AT T AGT G AAC T T T T C A 
C CAT GAT GGAT T AC AT T C AAC C T GAT GT CTT G G AAC GAT AC C T GG TAT C AAAT T T T GAC T 
CCTGGGTTGGGGCTTTTGGGAATATCGAAAACTCCATGGAACTAGCCCCGACAGGAGATA 
AGTACCAACCCAAGAAACGGTTCAAGAAATTTGTCAACCTTCCTGAACTCATGCGAATCT 
AC AAGG AAACT G C C GAT AT T CAG AC C T C AGAC AT G CT T GAT T T AC C AGT AC CGG AAGCT A 
AG AT TAT T G C GGT GG AAAGC G AGT T AAC GC AAG C T CAG AAAT AC TAT T T G GAAG AG CT GG 
TAAAGCGTTCAGACGCTATCAAGTCAGGTAGTGTTGATCCAAGTAGAGATAACATGCTTA 
AAATCACAGGAGAAGCCAGAAAACTAGCTATTGATATGCGGTTGATTGACCCTACTTACT 
C C T T AT CGG AT AAT C AG AAAAT C C T T C AAGT AGT CG AT AAT GT C GAG C GGAT TTACCGTG 
ATGGAGCTGGAGACAAAGCCACTCAGATGATTTTCTCAGATATTGGAACCCCTAAAAGTA 
AG G AAG AAGGGT T T G AT GT C T AC AAT G AAC T T AAG GAC T TGTTTGTC GAT C GAG GGAT AC 
C AAAAG AAGAAAT TGCCTTTGTC CAT G AT GC C AAT AC T GAT G AG AAG AAAAAC T CT CT GT 
CACGCAAGGTCAATAGTGGAGAAGTACGGATTCTCATGGCTTCTACGGAAAAAGGGGGAA 
CAGGATTAAACGTCCAATCTCGCATGAAAGCTGTCCACTATTTAGACGTTCCCTGGAGGC 
C CT C AG AC AT T GT C C AGC G AAAT G GAC GAC T AAT T CGAC AAGGAAAC AT G C AC C AGG AGG 
T AGAT AT T TAT C AC TAT AT T AC T AAAGG GAGC T T T GAC AAT T AC C T CT GG C AG AC G C AGG 
AG AAT AAG C T AAAGT AT AT C AC C C AGAT AAT G AC CT CAAAAG AT C C T GT GAG AT CAG C T G 
AAG AC AT T GAT G AAC AAAC CAT GAC C G C C T CAG ACT TT AAGG CAT T G G C AAC T G GG AAC C 
C T TAT C T C AAAC T C AAAAT G G AGT T G GAAAAT G AAC T GAC AGT T T T AG AGAAT C AAAAAC 
GAG C CT T T AAT C G C T C C AAAGAC GAG TAT C GC C AT AC CAT T T C C TAT AG C GAG AAG C AC C 
TCCCTATTATGGAAAAACGGTTGAGTCAATATGATAAAGATATTGCCCAATCTTTGGCAA 
CCAAGTCGCAAGATTTTGTCATGCGATTTGACAATCAAGCAATGGATAATCGTGCTGAAG 
CTGGGGACTATCTGCGAAAACTCATTACCTATAACCGCTCAGAGACCAAGGAAGTCAGGA 
C AC T T G C CAG CTT TAG AGG AT TT GAT T T AAAAAT G ACT AC ACGAG G T G C T AGT GAGC C C T 
T AC CAG AAAC CAT T T C T T T AAT GAT T GT AGGT GAT AAC C AGT AT AC TGTCGCCCTT GAT T 
TGAAATCAGACGTGGGAACCATTCAACGGATTAGTAATGCCATTGACCATATTATAGATG 
AC C AAG AAAAGAC G C AAG AG C T GGT AAAGGAT T T AAAAG AT AAG C T AC G AGT AG C C AAAG 
TAGAAGTTGATAAAGTCTTTCCAAAGGAAGAGGACTATCAGCTTGTAAAGGCTAAGTATG 
AT GT T T TAG CTCCCTTGGTT GAAAAAGAAG CAG AGAT T GAAG AG AT AG AT G C AG CT T T GG 
CC AAGT T T AGT GAAG AT AC AAC AC C C C AAAAGAAG C AAC AAAT AG C ACT C GAG AT A 

SEQ ID. NO. 7002 

STRAIN H36B

GG AGG GAAAAT G AAT C AAGAAGT CT TACT AC AAAT GAT 

GAG AG C C AC TAT T C CT C GT G AT AGAG CCTTGCTT GAGG C AT T T T TAT AT T 

ACC AAG CAG AG CAT T T T GAT GAG G AGT GG G AT AG T CT TAT T CAT C AGT T T 

AT GAC C AAT AGG C AAGAAAT AAAT AAG T C T GT T C AAGT AC T T C AC T T T G A 

GACAGATGTTTCAGCTTTTGTCCAGGCTAGTCCTTATGATACTGCTCATG 

ATCTATTGACCTATACACAAGTTTTCGGCCAAAGTGGTCTTCAAAAACTA 

GATAAACTATCGCCGTCTGAAAAAAACTTGGTGATAGAAGTGGCCTTGTT 

CAATCTGGCCACTCGTTTTCAATTATTGGATTCCAATGGACACTACCAAA 

CCATATCGCCGGATTCACTCTTACAAAAGAGTAGGGGAGCTAATTTGGTC 

AATGTGTATCGTGTGGCTAATAATTTAGCGGATCGTATTAGTCGAGATAT 

T GAAC AGT T T C T C TT AAC T TAG GAG C C T GAGCT T G AAAC TAG AG C T GAT G 

AAACT GT T C TAG AAAAT G AAGAAAC T GT T GAT GAGC AC AAAAC AAGT GT T 

CATCAAGCAATATCTTTTCGAGAAGAGGGCTCTCTGGTTATTGCTAGTTT 

GG AT GT AG AT T T GT C T C AAC T AG AT GT T C AAAT AGG AAAAAC C AGT CAT C 

T G C CAG C T T AT GAAG AGT TAT CC T T AC G AC GT AAAT T T GAG AT T C T AAC A 

TATTTTGACCAAATTCGAAATGAACGTTCC AAAGT CCCAAGTTTTAGACG 
AGGT GAT T T T G AC AC AG AG AT G GAAAT G AC AC C AGT C T T T GAT G GCG AGG
AAT T AC T TACT TAT CT C GAAG C T GAT G G C AGT C C C TAT GAG C T G AAAC GA 
ACG C T G ACT AC AGT CG AAG AAAAGGAAT T AG AAAAAATT GG AC AAG C CAT 
TAGGAT AGAAAAT CAAGAAAAAT T GACTCAGCTAs GkATTGr TTTAT CT C 
AGTTTGACCCAGACCGAGTCGGTATTTTATTGkATGCAGCAGGTCGTyyT 
CGTTTAwAwAATGCAGACCTTGCTTCACTAGGTGGTTATCCCAAAGCCTC 
GGTAACTCAACTAGCCCTTGCGACAGAACTACTCCAAATGGGACTAAGTC 
ATGAAAAGGTTGAATTTTTCTTTGGTAGCCAGCTTTCCATTGAAGAGCTG 
CGAC AAGT T G C CT ACG C C T T T T T AC AC C AAG AAC T C AG C AGAG AAG AT G C 
GGAG C AAT T T GAAAAAGAT AAAG G T AAT C AG C C AG AT T T AACT C T C AG AG 
AT T G GAAAAGC AAG CT AGAGAAAGCT GAG G GAAAAG AAGT AGT T GAT G AA 
G AAT T CGCG GAAAAT C C ACT G GT T C AGAGAGT AT T GG AC ACT T AT C C T CT 
GGGGTCATTGGTTTCCTATAAGGGACAGGACTTTGAGGTCATGTCGGTCA 
GCGATGCTCGAtTGAACGGTTTGATTCGGATTGAGTTAGTCAATGACTTT 
T C G GAT AT CAT T G AAC AAAAT C C AGT T C T T TAT GT GAG G AC C T GGG AAG A 
AGTCAGTCAGGCACTTCATCAGCCAAAGGCAGAACCACAAACAGAGTTAG 
AAGAAGCGGACCAAGAATTAAACCTATTCTCATTTCTGGAAGAGGAGCTA 
GTTCAGAGTATTGGACTATTGGAACCAGATGATTCAGAAAATGGTCATAA 
CG AT AC T GAT C T T GAAG AAAC AG AT AAT C AAATT C C T G AAG AGGAAG T C G 
T C G AAAC AAT T C C AGAG AT T C C AGT AACGG AC TT T T AT T T T C C AGAAG AT 
TTGACGGACTTTTATCCTAAGACTGCTAGAGATAAGGTTGAGACAAACAT 
TGTGGCCATTCGTTTGGTAAAAAATCTAGAAGTAGAGCACCGCAATGCTT 
CACCAAGTGAACAAGAACTCCTTGCCAAGTATGTAGGCTGGGGTGGACTA 
GC C AAT G AAT T T T T T GAT G AC TAT AAT CC AAAAT T T T CT AAGG AAC GAGA 
AGAAC T GAAG AG C C T AGT C AC AG AT AAAG AG TAT T C G GAT AT G AAAC AGT 
C C T C C C T G AC AG C C T AT T AC AC AG AC C CAT CC CT GAT C CGT C AGAT GT GG 
GAT AAGT T GGAAAG AG AT G G C T T T AC AGGT GGC AAAAT C C TAG AT C CT T C 
CAT G G G AAC AG G G AAT TTCTTTGCG G CT AT G C C AAAAC AC T T AAG AG AAA 
AG AG T G AGT T G TAT G G CGT AG AG T TAG AT AC TAT T AC AG GAG C TAT T G CC 
AAACACCT T CAT C CCAATAGT CAT ATT GAAAT TAAGG GAT TTGAGACGGT 
GGCTTTTAACGACAATAGTTTTGATTTGGTGATTTCAAATGTGCCCTTTG 
C C AAT AT ACGAAT T G C GGAT AAT AG GT ACG AT AGG C C T T AC AT GAT T C AT 
GACTACTTTGTCAAAAAGTCACTTGATTTGCTTCATGATGGTGGACAAGT 
AGCG AT T AT C T C T T C C AC AGG AAC TAT G GAT AAG C G AAC AG AAAAC AT CT 
TACAAGATATTCGTGAGACAACTGAATTTCTTGGTGGGGTTCGACTGCCT 
GAC TCTGCCTT T AAG G CC AT T G C AG G AACGAGT GT C AC AAC GGAT AT GT T 
AT T CT T C C AG AAAC AC T T AG AC AAGGG AT AT GT GG C AGAC GAT T T AGC CT 
TTTCAGGTTCCATTCGCTATGACAAGGATAGTCGCATTTGGCTCAATCCT 
TATTTTGATGGAGAATACAATAGCCAGGTGCTAGGAACCTACGAGGTCAG 
G AAT T T T AAC GGAG GAAC AC T T T C T GT T AAGGGG AC T AGT GAT GAC T T G A 
T T G C AAGT GT T G AAAC AG C T C T AAAT C ACGT T AAGG C C C C AAGAG AG ATT 
GAT AGAAAT GAGGT CAT CAT T AAC CC AGAT GTGTT GAC C AAAC AAGT C AA 
T GAT AC C T C CAT T C C AG C T GAAAT GAG G GAAAAT CT AGG T C AGT AC AG T T 
T T GGT T AT C AGGG GT C T AC AG T T T AC TAT C G AGAT AAC AAAG G CAT T C G A 
GT C G G AAC C AAG AC G G AAGAAAT C AGT TAG TAT GT C GAT GAAG AG 

SEQ ID. NO. 7003 

STRAIN 18RS21 

GnAGGGAAAATGAATCAAGAAGTCTTACTACAAATGATGAGA 
GCCACTATTCCTCGTGATAGAGCCTTGCTTGAGGCATTTTTATATTACCA 
AGC AG AG CAT T T T GAT G AGG AGT G GGAT AGT C T T AT T CAT C AG T T TAT G A 
CCAAT AGGC AAG AAAT AAAT AAGT CTGTTC AAGT ACTTC ACT TTGAG AC A 
GATGTTTCAGCTTTTGTCCAGGCTAGTCCTTATGATACTGCTCATGATCT 
ATTGACCTATACACAAGTTTTCGGCCAAAGTGGTCTTCAAAAACTAGATA 
AACTATCGCCGTCTGAAAAAAACTTGGTGATAGAAGTGGCCTTGTTCAAT 
CTGGCCACTCGTTTTCAATTATTGGATTCCAATGGACACTACCAAACCAT 
AT C GC CGG AT T C AC T C T T AC a AAAGAGT AG GGG AG C T AAT T T G G T C AAT G 
T GT AT CGTGTGGC T AAT AAT T T AG C G GAT C GT AT T AG T C GAG AT AT T G AA 
CAGTTTCTCTTAACTTACGAGCCTGAGCTTGAAACTAGAGCTGATGAAAC 
TGTTCTAGAAAATGAAGAAACTGTTGATGAGCACAAAACAAGTGTTCATC 
AAGCAATATCTTTTCGAGAAGAGGGCTCTCTGGTTATTGCTAGTTTGGAT 
GT AG ATT T GT C T C AACT AG AT GT T C AAAT AGGAAAAAC C AGT CAT CT GC C 
AG CT TAT GAAG AGT TAT C CT T AC G AC GT AAAT T T GAG AT T C T AAC AT AT T 
TTGACCAAATTCGAAATGAACGTTCCAAAGTCCCAAGTTTTAGACGAGGT 
GAT T T T GAC AC AG AG AT GG AAAT G AC AC C AGT CT T T GAT G GC GAG G AAT T
ACT TAG T TAT C T C GAAG C T GAT G G C AGT C C CT AT GAG CT GAAACG AACGC 
T G ACT AC AG t c G AAG AAAAGG AAT T AG AAAAAAT T GG AC AAG C CAT T AGG 
AT AG AAAAT C AAGAAAAAT T GAC T C AG CT AGGG AT T GAT T T AT C TC AGT T 
T GACC C AG AC C GAGT C G GT AT T T TAT T G GAT G C AG C AGGT CGTTTTCGTT 
TAAAAAATGCAGACCTTGCTTTACTAGGTGGTTATCCCAAAGCCTCGGTA 
ACT C AACT AG CC CT T G C G AC AGAACT AC T C C AAAT GGG ACT AAGT CAT GA 
AAAGGTTGAATTTTTCTTTGGTAGCCAGCTTTCCATTGAAGAGCTGCGAC 
AAG T T GCC T ACG C C T T T T T AC AC C AAG AACT C AG C AG AG AAG AT G C GG AG 
CAATTTGAAAAAGATAAAGGT AAT CAGCCAGAT TT AACT CT CAGAGAT T G 
G AAAAG C AAG C T AG AG AAAG CT G AGGGAAAAG AAGT AGT T GAT GAAG AAT 
TCGCGGAAAATCCACTGGTTCAGAGAGTATTGGACACTTATCCTCTGGGG 
T CAT T G GT T T C CT AT AAG GG AC AGGAC T T T G AGGT CAT GT C G GT C AG C G A 
TGCTCGATTGAACGGTTTGATTCGGATTGAGTTAGTCAATGACTTTTCGg 
AT AT CATTGAACAAAAT C CAGT TC t TT At GT GAGGACCT GGG AAGAAGTC 
AGT C AGG C AC T T CAT C AG C C AAAGG C AG AAC C AC AAAC AG AGT TAG AAG A 
AG CGG AC C AAGAAT T AAAC C T AT T CT CAT T T C T G GAAG AGG AG C C AGT T C 
AG AGT AT T G G ACT AT T GG AAC C AG a T GAT T C AGAAAAT G GT CAT AAC GAT 
ACT GAT CTT GAAGAAACAGAT AAT CAAATT CCTGAAGAGGAAGT CGT CGA 
AAC AAT T C CAGAGAT T C C AGT AACGG ACT TT T AT T T T C C AGAAGAT T T GA 
C GG ACT T T T AT C C T AAG ACT GC T AG AG AT AAG GT T GAGAC AAAC AT T GT G 
G C CAT T C GT T T GGT AAAAAAT C T AGAAGT AG AG C AC CG C AAT G C T T C AC C 
AAGTGAACAAGAACTCCTTGCCAAGTATGTAGGCTGGGGTGGACTAGCCA 
AT G AAT T T T T T GAT GAC TAT AAT C C AAAAT T T T C T AAGG a AC G AG AAGAA 
CTGAAGAGCCTAGTCACAGATAAAGAGTATTCGGATATGAAACAGTCCTC 
C C T GAC AGC C T AT T AC AC AG AC C CAT C C CT GAT C C GT C AGAT GT GGG AT A 
AGT TGG AAAG AG ATGGCTTT AC AGGTGGC AAAAT CCTAGATCCTTCCATG 
GGAACAGGGAATTT CTT TGCGG CT ATGC CAAAAC ACTT AAG AGAAAAGAG 
TGAGTTGTATGG CGT AGAGTT AGAT ACT ATTAC AGG AGCTATTGCCAAAC 
ACCTTCATCCCAATAGTCATATTGAAATTAAGGGATTTGAGACGGTGGCT 
TTTAACGACAATAGTTTTGATTTGGTGATTTCAAATGTGCCCTTTGCCAA 
TATACGAATTGCGGATAATAGGTACGATAGGCCTTACATGATTCATGACT 
ACTTTGTCAAAAAGTCACTTGATTTGCTTCATGATGGTGGACAAGTAGCG 
ATTATCTCTTCCACAGGAACTATGGATAAGCGAACAGAAAACATCTTACA 
AGATATTCGTGAGACAACTGAATTTCTTGGTGGGGTTCGACTGCCTGACT 
CTGCCTTTAAGGCCATTGCAGGAACGAGTGTCACAACGGATATGTTATTC 
T T C CAG AAAC AC T TAG AC AAGG GAT AT G T GG C AG ACGAT T T AG C C T T T T C 
AGGTTCCATTCGCTATGACAAGGATAGTCGCATTTGGCTCAATCCTTATT 
T T GAT G GAG AAT AC AAT AG C C AGGT G CT AGG AAC C T AC GAGGT C AGGAAT 
T T T AAC GGAG G AAC ACT T T C T GT T AAGGG G AC T AGT GAT G ACT T GAT T G C 
AAGTGTTGAAACAGCTCTAAATCACGTTAAGGCCCCAAGAGAGATTGATA 
GAAAT GAGGT CAT C AT T AAC C CAG AT GT GT T GAC C AAAC AAGT C AAT GAT 
ACCTCCATTCCAGCTGAAATGAGGGAAAATCTAGGTCAGTACAGTTTTGG 
TTATCAGGGGTCTACAGTTTACTATCGAGATAACAAAGGCATTCGAGTCG 
G AAC C AAG AC G GAAG AAAT CAGT T AC TAT GT C GAT GAAG AG 



SEQ ID. NO. 7004 
STRAIN H36B frame: 1 

GGKMNQEVLLQMMRATIPRDRALLEAFLYYQAEHFDEEWDSLIHQFMTNRQEINKSVQVL

HFETDVSAFVQASPYDTAHDLLTYTQVFGQSGLQKLDKLSPSEKNLVIEVALFNLATRFQ 

LLDSNGHYQTISPDSLLQKSRGANLVNVYRVANNLADRISRDIEQFLLTYEPELETRADE 

TVLENEETVDEHKTSVHQAISFREEGSLVIASLDVDLSQLDVQIGKTSHLPAYEELSLRR 

KFEILTYFDQIRNERSKVPSFRRGDFDTEMEMTPVFDGEELLTYLEADGSPYELKRTLTT 

VEEKELEKIGQAIRIENQEKLTQLXIXLSQFDPDRVGILLXAAGRXRLXNADLASLGGYP 

KASVTQLALATELLQMGLSHEKVEFFFGSQLSIEELRQVAYAFLHQELSREDAEQFEKDK 

GNQPDLTLRDWKSKLEKAEGKEVVDEEFAENPLVQRVLDTYPLGSLVSYKGQDFEVMSVS

DARLNGLIRIELVNDFSDIIEQNPVLYVRTWEEVSQALHQPKAEPQTELEEADQELNLFS 

FLEEELVQSIGLLEPDDSENGHNDTDLEETDNQIPEEEVVETIPEIPVTDFYFPEDLTDF

YPKTARDKVETNIVAIRLVKNLEVEHRNASPSEQELLAKYVGWGGLANEFFDDYNPKFSK 

EREELKSLVTDKEYSDMKQSSLTAYYTDPSLIRQMWDKLERDGFTGGKILDPSMGTGNFF 

AAMPKHLREKSELYGVELDTITGAIAKHLHPNSHIEIKGFETVAFNDNSFDLVISNVPFA 

NIRIADNRYDRPYMIHDYFVKKSLDLLHDGGQVAIISSTGTMDKRTENILQDIRETTEFL 

GGVRLPDSAFKAIAGTSVTTDMLFFQKHLDKGYVADDLAFSGSIRYDKDSRIWLNPYFDG 

EYNSQVLGTYEVRNFNGGTLSVKGTSDDLIASVETALNHVKAPREIDRNEVIINPDVLTK 



QVNDTSIPAEMRENLGQYSFGYQGSTVYYRDNKGIRVGTKTEEISYYVDEE 

SEQ ID. NO. 7005 

STRAIN 18RS21 frame: 1 

XGKMNQEVLLQMMRATIPRDRALLEAFLYYQAEHFDEEWDSLIHQFMTNRQEINKSVQVL 
HFETDVSAFVQASPYDTAHDLLTYTQVFGQSGLQKLDKLSPSEKNLVIEVALFNLATRFQ
LLDSNGHYQTISPDSLLQKSRGANLVNVYRVANNLADRISRDIEQFLLTYEPELETRADE 
TVLENEETVDEHKTSVHQAISFREEGSLVIASLDVDLSQLDVQIGKTSHLPAYEELSLRR 
KFEILTYFDQIRNERSKVPSFRRGDFDTEMEMTPVFDGEELLTYLEADGSPYELKRTLTT 
VEEKELEKIGQAIRIENQEKLTQLGIDLSQFDPDRVGILLDAAGRFRLKNADLALLGGYP 
KASVTQLALATELLQMGLSHEKVEFFFGSQLSIEELRQVAYAFLHQELSREDAEQFEKDK 
GNQPDLTLRDWKSKLEKAEGKEVVDEEFAENPLVQRVLDTYPLGSLVSYKGQDFEVMSVS 
DARLNGLIRIELVNDFSDIIEQNPVLYVRTWEEVSQALHQPKAEPQTELEEADQELNLFS 
FLEEEPVQSIGLLEPDDSENGHNDTDLEETDNQIPEEEVVETIPEIPVTDFYFPEDLTDF
YPKTARDKVETNIVAIRLVKNLEVEHRNASPSEQELLAKYVGWGGLANEFFDDYNPKFSK 
EREELKSLVTDKEYSDMKQSSLTAYYTDPSLIRQMWDKLERDGFTGGKILDPSMGTGNFF 
AAMPKHLREKSELYGVELDTITGAIAKHLHPNSHIEIKGFETVAFNDNSFDLVISNVPFA 
NIRIADNRYDRPYMIHDYFVKKSLDLLHDGGQVAIISSTGTMDKRTENILQDIRETTEFL 
GGVRLPDSAFKAIAGTSVTTDMLFFQKHLDKGYVADDLAFSGSIRYDKDSRIWLNPYFDG 
EYNSQVLGTYEVRNFNGGTLSVKGTSDDLIASVETALNHVKAPREIDRNEVIINPDVLTK 
QVNDTSIPAEMRENLGQYSFGYQGSTVYYRDNKGIRVGTKTEEISYYVDEE 

SEQ ID. NO. 7006 

STRAIN 2603 frame: 1 

ggkmnqevllqmmratiprdralleaflyyqaehfdeewdslihqfmtnrqeinksvqvl 
hfetdvsafvqaspydtahdlltytqvfgqsglqkldklspseknlvievalfnlatrfq
lldsnghyqtispdsllqksrganlvnvyrvannladrisrdieqflltyepeletrade
tvleneetvdehktsvhqaisfreegslviasldvdlsqldvqigktshlpayeelslrr 
kfeiltyfdqirnerskvpsfrrgdfdtememtpvfdgeelltyleadgspyelkrtltt 
veekelekigqairienqekltqlgidlsqfdpdrvgilldaagrfrlknadlallggyp 
kasvtqlalatellqmglshekvefffgsqlsieelrqvayaflyqelsredaeqfekdk 
gnqpdltlrdwksklekaegkevvdeefaenplvqrvldtyplgslvsykgqdfevmsvs
darlnglirielvndfsdiieqnpvlyvrtweevsqalhqpkaepqteleeadqelnlfs 
fleeepvqsigllepddsenghndtdleetdnqipeeevvetipeipvtdfyfpedltdf
ypktardkvetnivairlvknlevehrnaspseqellakyvgwgglaneffddynpkfsk 
ereelkslvtdkeysdmkqssltayytdpslirqmwdklerdgftggkildpsmgtgnff 
aampkhlrekselygveldtitgaiakhlhpnshieikgfetvafndnsfdlvisnvpfa
niriadnrydrpymihdyfvkksldllhdggqvaiisstgtmdkrtenilqdirettefl 
ggvrlpdsafkaiagtsvttdmlffqkhldkgyvaddlafsgsirydkdsriwlnpyfdg 
eynsqvlgtyevrnfnggtlsvkgtsddliasvetalnhvkapreidrneviinpdvltk 
qvndtsipaemrenlgqysfgyqgstvyyrdnkgirvgtkteeisyyvdee

SEQ ID NO. 7101 
STRAIN 2603 

ATGAAAAAGAAAATTATTTTGAAAAGTAGTGTTCTTGGTTTAGTCGCTGGGACTTCTATT 
ATGTTCTCAAGCGTGTTCGCGGACCAAGTCGGTGTCCAAGTTATAGGCGTCAATGACTTT 
CATGGTGCACTTGACAATACTGGAACAGCAAATATGCCTGATGGAAAAGTTGCTAATGCT 
GGT ACT GCT GCT C AAT TAG AT G C T TAT AT G GAT G AC G C T C AAAAAG AT T T C AAAC AAAC T 
AAC C C T AAT G GT G AAAG CAT T AGG GT T C AAG C AGG C GAT AT G GT T G GAG C AAGT C C AG C C 
AACTCTGGGCTTCTTCAAGATGAACCAACTGTCAAAAATTTTAATGCAATGAATGTTGAG 
TAT G G C AC AT T GGGT AAC CAT G AAT T T GAT G AAGGGT T G G C AG AAT AT AAT C GT AT C GT T 
ACTGGTAAAGCCCCTGCTCCAGATTCTAATATTAATAATATTACGAAATCATACCCACAT 
GAAGCTGCAAAACAAGAAATTGTAGTGGCAAATGTTATTGATAAAGTTAACAAACAAATT 
CCTTACAATTGGAAGCCTTACGCTATTAAAAATATTCCTGTAAATAACAAAAGTGTGAAC 
GTTGGCTTTATCGGGATTGTCACCAAAGACATCCCAAACCTTGTCTTACGTAAAAATTAT 
G AAC AAT AT GAAT T T T TAG AT GAAG C T GAAAC AAT C G T T AAAT AC G C C AAAG AAT T AC AA 
GCTAAAAATGTCAAAGCTATTGTAGTTCTCGCACATGTACCTGCAACAAGTAAAAATGAT 
AT T G CT G AAGGT GAAG C AG C AG AAAT GAT G AAAAAAGT C AAT C AAC T CT T C C CT G AAAAT 
AGCGTAGATATTGTCTTTGCTGGACACAATCATCAATATACAAATGGTCTTGTTGGTAAA 
ACTCGTATTGTACAAGCGCTCTCTCAAGGAAAAGCCTATGCTGATGTACGTGGTGTCTTA 
GAT AC T GAT AC AC AAG AT T T CAT T GAG AC C C C T T C AG C T AAAGT AAT T G C AGT T G C T C C T 
G G T AAAAAAAC AGGT AGT G C C GAT AT T C AAG C CAT T G T T G AC C AAGC T AAT AC TAT CGT T 
AAACAAGTAACAGAAGCTAAAATTGGTACTGCCGAGGTAAGTGTCATGATTACGCGTTCT 
GTTGATCAAGATAATGTTAGTCCGGTAGGCAGCCTCATCACAGAGGCTCAACTAGCAATT
GCTCGAAAAAGCTGGCCAGATATCGATTTTGCCATGACAAATAATGGTGGCATTCGTGCT
GACTTACTCATCAAACCAGATGGAACAATCACCTGGGGAGCTGCACAAGCAGTTCAACCT 
TTTGGTAATATCTTACAAGTCGTCGAAATTACTGGTAGAGATCTTTATAAAGCACTCAAC 
GAACAATACGACCAAAAACAAAATTTCTTCCTTCAAATAGCTGGTCTGCGATACACTTAC 
ACAGATAATAAAGAGGGCGGGGAAGAAACACCATTTAAAGTTGTAAAAGCTTATAAATCA
AATGGTGAGGAAATCAATCCTGATGCAAAATACAAATTAGTTATCAATGACTTTTTATTC
GGTGGTGGTGATGGCTTTGCAAGCTTCAGAAATGCCAAACTTCTAGGAGCCATTAACCCC 
GATACAGAGGTATTTATGGCCTATATCACTGATTTAGAAAAAGCTGGTAAAAAAGTGAGC 
GTTCCAAATAATAAACCTAAAATCTATGTCACTATGAAGATGGTTAATGAAACTATTACA 
CAAAATGATGGTACACATAGCATTATTAAGAAACTTTATTTAGATCGACAAGGAAATATT
GTAGCACAAGAGATTGTATCAGACACTTTAAACCAAACAAAATCAAAATCTACAAAAATC
AACCCTGTAACTACAATTCACAAAAAACAATTACACCAATTTACAGCTATTAACCCTATG
AGAAATTATGGCAAACCATCAAACTCCACTACTGTAAAATCAAAACAATTACCAAAAACA
AACTCTGAATATGGACAATCATTCCTTATGTCTGTCTTTGGTGTTGGACTTATAGGAATT
GCTTTAAATACAAAGAAAAAACATATGAAA

SEQ ID NO. 7102 

STRAIN 090

AAGTCGGTGTCCAAGTTATAGGCGTCAATGACTTTCATGGTGCACTTGAC 
AAT ACT GGAAC AGC AAAT AT G C C T G AC GG AAAAGT T AC T AAT G CT GG C AC 
T G C T GC T C AAT TAG AT G CT T AT AT G GAT GAT GCT C AAAAAG AT T T C AAAC 
AAACTAACCCTAATGGTGAAAGCATTAGAGTTCAAGCTGGTGATATGGTT 
G GAG C AAGT C C AGC T AAC T C AGGG CT T CT T C AAG AT G AAC C AAC C GT T AA 
AACATTTAATGCAATGAATGTTGAGTATGGCACATTAGGTAACCATGAAT 
TTGATGAAGGTTTGGCAGAATACAATCGTATCGTTACTGGAAAGGCCCCT 
GCTCCAGATTCTAATATAAATAATATTACGAAATCATACCCACACGAAGC 

T G C AAAAC AAGAAAT TGT AGT GG C AAAC G T T AT T G AT AAAGT T AAC AAAC 
AAATCCCTTACAATTGGAAACCTTACGCTATTAAAAATATTCCTGTAAAT 

AAC AAAAGT GT G AAC GT T GGC T T T AT C GG AAT C G T T AC C AAAG AC AT C C C 
AAACCTTGTCTTACGTAAAAATTATGAACAATATGAATTTTTAGATGAAG 
CTGAAACAAT CGTT AAAT ACGC CAAAGAATT ACAAGCTAAAAAT GT C AAG 
GCTATTGTAGTCCTTGCTCATGTACCTGCAACAAGCAAGGATGATATTGC 
TGAAGGTGAAGCAGCAGAAATGATGAAAAAAGTCAATCAACTCTTCCCTG 
AAAAT AG CGT AG AT AT TGT CTTTGCTGG AC AC AAT CATC AAT AT AC AAAT 
GGTCTTGTTGGTAAAACTCGCATTGTACAAGCGCTCTCTCAAGGAAAAGC 
C TAT GCT G AC GT AC GT GGT GT C CT AG AT ACT GAT AC AC AAG AT T T CAT T G 
AAAC CCCTTCAGCTAAAGTAGTTGCAGTTGCT CCT GGT AAAAAAACAGGT 
AGTGCCGATATTCAAGCCATTGTTGACCAAGCTAATACTATCGTTAAACA 
AGT AAC AGAAGCT AAAAT TGGTACTGCCGAGGT AAGT GGC AT G ATT ACGC 
GTTCTGTTGATCAAGATAATGTTAGTCCAGTAGGCAGCCTCATCACAGAG 
GCTCAACTAGCAATTGCTCGAAAAAGCTGGCCAGATATCGATTTTGCCAT 
G AC AAAT AAT GGT GG C AT T CGT G CT G AC T T AC T CAT C AAAC C AGAT G G AA 
CAATCACCTGGGGAGCTGCACAAGCAGTTCAACCTTTTGGTAATATCTTA 
C AAG T CGT C GAAATT ACTGGT AG AG ATCT T T AT AAAGC ACT C AACGAAC A 
ATACGACCAAAAACAAAATTTCTTCCTTCAAATAGCTGGTCTGCGATACA 
CT T AC AC AG AT AAT AAAG AG G G C G GAG AAG AAAC AC CAT T T AAAGT T GT A 
AAAG C T TAT AAAT C AAAT GGT GAAG AAAT C AAT CCT GAT G C AAAAT AC AA 
ATTAGTTATCAATGACTTTTTATTCGGTGGTGGTGATGGCTTTGCAAGCT 
T C AGAAAT G C C AAAC T T C T AG GAG C CAT T AAT C C C GAT AC AG AG GT AT T T 
AT GG C C TAT AT C ACT GAT T T AGAAAAAGC T GGT AAAAAAGT G AG CGT T C C 
AAAT AAT AAACCT AAAAT CTATGT C ACT AT GAAG ATGGTT AAT GAAACT A 
T T AC AC AAAAT GAT G G T AC AC AT AG CAT TAT T AAG AAAC T T TAT T TAG AT 
C G AC AAGGAAAT AT T GT AG C AC AAG AG AT T GT AT C AG AC AC T T T AAAC C A 
AAC AAAAT C AAAAT CT AC AAAAAT CAACCCTGT AACT AC AATTCACAAAA 
AAC AAT T ACAC C AAT T T AC AG C T AT T AACC C T AT GAG AAAT TAT G GC AAA 
C CAT C AAAC T C C ACT ACT G T AAAAT C AAAAC AA 

SEQ ID NO. 7103 

STRAIN A909 

GCGTCAATGACTTTCATGGTGCaCTTGACAATACTGGAACAGCAAATATG 
CCTGACGGAAAAGTTACTAATGCTGGCACTGCTGCTCAATTAGATGCTTA 
T AT GG AT GAT GCT C AAAAAG AT T T C AAAC AAAC T AAC C C T AAT GGT GAAA 
G CAT TAG AGT T C AAG C T G GT G AT AT G GT T GG AG C AAGT C C AG C T AAC T C A 
GGGCTTCTTCAAGATGAACCAACCGTTAAAACATTTAATGCAATGAATGT
TGAGTATGGCACATTAGGTAACCATGAATTTGATGAAGGTTTGGCAGAAT
ACAATCGTATCGTTACTGGAAAGGCCCCTGCTCCaGaTTCTAATATAAAT 
AATATTACGAAATCATACCCACACGAAGCTGCAAAACAAGAAATTGTAGT 
GGCAAACGTTATTGATAAAGTTAACAAACAAATCCCTTACAATTGGAAAC 
C T T AC AC TAT T AAAAAT AT T C CT GT AAAT AAC AAAAG T G TG AACGT T GG C 
T T TAT CG GAAT CGT T AC C AAAG AC AT C C C AAAC C T T GT C TT AC GT AAAAA 
T TAT GAAC AAT AT GAAT T T T TAG AT G AAG CT GAAAC AAT CGT T AAAT ACG 
CCAAAGAATTACAAGCTAAAAATGTCAAGGCTATTGTAGTCCTTGCTCAT 
GT AC C T G C AAC AAGC AAG GAT GAT AT T G C T GAAGGT GAAGC AG C AG AAAT 
GATGAAAAAAGTCAATCAACTCTTCCCTGAAAATAGCGTAGATATTGTCT 
T TG CT G GAC AC AAT CAT C AAT AT AC AAAT GGTCTTGTT GGT AAAAC T C GT 
ATTGTACAAGCGCTCTCTCAAGGAAAAGCCTATGCTGATGTACGTGGTGT 
CCTAGATACTGATACACAAGATTTCATTGAAACCCCTTCAGCTAAAGTAA 
TTGCAGTTGCTCCTGGTAAAAAAACAGGTAGTGCCGATATTCAAGCCATT 
GT T GAC C AAG C T AAT ACT AT C GT T AAAC AAG T AAC AG AAGC T AAAAT T GG 
T AC T G C C G AGGT AAGT GGC AT G AT T AC GCGTTCTGT TG ATC AAG AT AAT G 
T T AGT C C G GT AGG C AG C CT CAT C AC AGAG G CT C AACT AG C AAT T G C T C G A 
AAAAGC T GGC C AGAT AT C GAT T T T G C CAT GAC AAAT AAT GGT GG CAT T CG 
T G C T GAC T T AC T CAT C AAAC C AG AT GG AAC AAT C AC C T GGGG AGC T G C AC 
AAG C AGT T C AAC CT T T T GG T AAT AT CT T AC AAG T C GT C GAAAT TACT G GT 
AGAGAT CTT TAT AAAGCACT CAACGAAC AAT ACG AC C AAAAAC AAAAT TT 
CT T C C T T C AAAT AG C T GGT CT GCG AT AC AC T T AC AC AG AT AAT AAAG AGG 
GCGGGGAAGAAACACCATTTAAAGTTGTAAAAGCTTATAAATCAAATGGT 
G AGG AAAT C AAT C C T GAT G C AAAAT AC AAAT T AGT TAT C AAT GAC TT T T T 
ATTCGGTGGTGGTGATGGCTTTGCAAGCTTCAGAAATGCCAAACTTCTAG 
GAG C C AT T AAT C C C GAT AC AGAGGT AT T TAT G G C C TAT AT C AC T GAT T T A 
GAAAAAGCT GGT AAAAAAGTGAGCGTTCCAAAT AAT AAAC CT AAAAT CT A 
T GT C AC TAT G AAG AT GGT T AAT GAAACT AT T AC AC AAAAT GAT G GT AC AT 
AT AG CAT TAT T AAG AAAC T T TAT T T AGAT C G AC AAGG AAAT AT T GT AGC A 
C AAG AG AT T GT AT C AGAC ACT T T AAAC C AAAC AAAAT C AAAAT C T AC AAA 
AAT C AAC CC T GT AAC T AC AAT T C AC AAAAAAC AAT TAG AC C AAT T T AC AG 
CTATTAACCCTATGAGAAATTATGGCAAACCATCAAACTCCACTACTGTA 
AAAT C AAAAC AA 

SEQ ID NO. 7104 

STRAIN H36B

CCAAGTCGGTGTCCAAGTTATAGGCGTCAATGACTTTCATGGTGCACTTG 
AC AAT ACT GG AAC AG C AAAT AT G CC T G ACG G AAAAGT T AC T AAT G C T G G C 
ACTGCTGCTCAATTAGATGCTTATATGGATGATGCTCAAAAAGATTTCAA 
AC AAACT AAC C C T AAT G GT GAAAG CAT TAG AGT T C AAG CT G G T GAT AT G G 
T T GGAG C AAGT C C AG CT AAC T C AG GG CT T CT T C AAG AT GAAC C AAC CGT T 
AAAAC AT T T AAT GC AAT GAAT GT TG AGT AT GGC AC ATT AGGT AAC C AT GA 
ATTTGATGAAGGTTTGGCAGAATACAATCGTATCGTTACTGGAAAGGCCC 
CT GCT C CAGAT T CT AAT AT AAAT AAT ATT ACGAAAT CAT ACCCACACGAA 
GCTGC AAAACAAGAAAT T GT AGT GGC AAACGT T AT T GAT AAAGT T AACAA 
AC AAAT CCCTT AC AATTGGAAACCTTACACT ATT AAAAAT ATT CCTGTAA 
AT AAC AAAAGT GT GAAC G T T GGCT T T AT CGGAAT C GT T AC C AAAG AC AT C 
C C AAAC CT T GT CT T AC G T AAAAAT TAT GAAC AAT AT GAAT T T T TAG AT G A 
AGCT GAAAC AAT CGT T AAAT AC G C C AAAG AAT T AC AAGC T AAAAAT GT C A 
AG G CT AT T G T AGT C C T T G CT C AT GT AC C T G C AAC AAG C AAG GAT GAT AT T 
GCT GAAGGT GAAG C AG C AG AAAT GAT G AAAAAAGT C AAT C AACT CT T C C C 
T G AAAAT AG C G TAG AT AT TGTCTTTGCTG GAC AC AAT CAT C AAT AT AC AA 
AT GGTCTTGTT GGT AAAACT CGT AT TGTACAAGCGCTCTCTCAAGGAAAA 
G C C TAT GCT GAT GT ACGT GGT G T C C TAG AT ACT GAT AC AC AAG AT T T CAT 
TGAAACCCCTTCAGCTAAAGTAATTGCAGTTGCTCCTGGTAAAAAAACAG 
GTAGTGCCGATATTCAAGCCATTGTTGACCAAGCTAATACTATCGTTAAA 
CAAGTAACAGAAGCTAAAATTGGTACTGCCGAGGTAAGTGGCATGATTAC 
GCGTTCTGTTGATCAAGATAATGTTAGTCCGGTAGGCAGCCTCATCACAG 
AGGCTC AACT AGC AAT T GCT CGAAAAAGCT GGC CAGAT AT CGATTTTGCC 
ATGACAAATAATGGTGGCATTCGTGCTGACTTACTCATCAAACCAGATGG 
AAC AAT C AC C T GG G GAG C T G C AC AAG C AGT T C AAC CT T T T G GT AAT AT C T 
T AC AAGT C GT C GAAAT T AC T GGT AG AGAT C T T TAT AAAGCACT C AAC G AA 
CAATACGACC AAAAAC AAAATTTCTTCCTTC AAAT AGCTGGTCTGCGAT A 
CACTTACACAGATAATAAAGAGGGCGGGGAAGAAACACCATTTAAAGTTG
TAAAAGCTTATAAATCAAATGGTGAGGAAATCAATCCTGATGCAAAATAC
AAATTAGTTATCAATGACTTTTTATTCGGTGGTGGTGATGGCTTTGCAAG 
CTTCAGAAATGCCAAACTTCTAGGAGCCATTAATCCCGATACAGAGGTAT 
TTATGGCCTATATCACTGATTTAGAAAAAGCTGGTAAAAAAGTGAGCGTT 
CCAAATAATAAACCTAAAATCTATGTCACTATGAAGATGGTTAATGAAAC 
TATTACACAAAATGATGGTACATATAGCATTATTAAGAAACTTTATTTAG 
AT C G AC AAGG AAAT AT T GT AG C AC AAG AG AT T GT AT C AG AC ACT T T AAAC 
C AAAC AAAAT C AAAAT CT AC AAAAAT C AACCCTGT AACT AC AAT T C AC AA 
AAAAC AAT T AC AC C AAT T T AC AG CT AT T AAC C C T AT GAG AAAT TAT G G C A 
AAC CAT CAAACTCCACTACTGT AAAAT CAAA 

SEQ ID NO. 7105 

STRAIN 18RS21 

G AC C AAGT C GGT GT C C AAGT TAT AG G C GT C AAT G ACT T T C 

AT GG T GC AC T T G AC AAT ACT G G AAC AGC AAAT AT G C C T G AC G G AAAAGT T 

AnTAATGCTGGCACTGCTGCTCAATTAGATGCTTATATGGATGATGCTCA 

AAAAG AT T T C AAAC AAAC T AAC C CT AAT GGT G AAAG CAT T AGAGT T C AAG 

CTGGTGATATGGTTGGAGCAAGTCCAGCTAACTCAGGGCTTCTTCAAGAT 

G AAC C AAC C GT T AAAAC AT T T AAT G CAAT G AAT GT T G AGT AT GGC AC AT T 

AGGTAACCATGAATTTGATGAAGGTTTGGCAGAATACAATCGTATCGTTA 

C T G G AAAGG C CC CT GC T CCAGAT T C T AAT AT AAAT AAT AT T ACGAAAT C A 

T AC C C AC AC G AAGC T G C AAAAC AAG AAAT T GT AGT G G C AAAC G T T AT T G A 

T AAAGT T AAC AAAC AAAT C C CT T AC AAT T G G AAAC CT T AC ACT AT T AAAA 

ATATTCCTGTAAATAACAAAAGTGTGAACGTTGGCTTTATCGGAATCGTT 

AC C AAAG AC AT C C C AAAC CT T GT CT T AC GT AAAAAT TAT G AAC AAT AT GA 

AT T T T T AGAT G AAG CT G AAAC AAT C G T T AAAT AC G C C AAAG AAT T AC AAG 

CTAAAAATGTCAAGGCTATTGTAGTCCTTGCTCATGTACCTGCAACAAGC 

AAG GAT GAT AT T G C T G AAGGT GAAG C AG C AG AAAT GAT G AAAAAAG T C AA 

TCAACTCTTCCCTGAAAATAGCGTAGATATTGTCTTTGCTGGACACAATC 

AT CAAT AT AC AAAT GGTCTTGTT GG T AAAAC T CG T AT T G T AC AAG C G CT C 

TCTCAAGGAAAAGCCTATGCTGATGTACGTGGTGTCCTAGATACTGATAC 

ACAAGATTTCATTGAAACCCCTTCAGCTAAAGTAATTGCAGTTGCTCCTG 

GTAAAAAAACAGGTAGTGCCGATATTCAAGCCATTGTTGACCAAGCTAAT 

ACTATCGTTAAACAAGTAACAGAAGCTAAAATTGGTACTGCCGAGGTAAG 

TGGCATGATTACGCGTTCTGTTGATCAAGATAATGTTAGTCCGGTAGGCA 

G C C T CAT C AC AG AGG C T C AAC TAG CAAT T GC T C G AAAAAG C T G G C C AG AT 

ATCGATTTTGCCATGACAAATAATGGTGGCATTCGTGCTGACTTACTCAT 

C AAAC C AG AT GG AAC AAT C AC CT GGGGAG C T G C AC AAG C AGT T C AAC C T T 

TTGGTAATATCTTACAAGTCGTCGAAATTACTGGTAGAGATCTTTATAAA 

G C ACT C AAC G AAC AAT AC G ACC AAAAAC AAAAT TTCTTCCTT C AAAT AG C 

TGGTCTGCGATACACTTACACAGATAATAAAGAGGGCGGGGAAGAAACAC 

CAT T T AAAGT T GT AAAAG C T TAT AAAT C AAAT G GT G AG G AAAT CAAT C C T 

GATGCAAAATACAAATTAGTTATCAATGACTTTTTATTCGGTGGTGGTGA 

TGGCTTTGCAAGCTTCAGAAATGCCAAACTTCTAGGAGCCATTAATCCCG 

ATACAGAGGTATTTATGGCCTATATCACTGATTTAGAAAAAGCTGGTAAA 

AAAGT GAG C GT T C C AAAT AAT AAAC CT AAAAT CT AT G T C AC TAT G AAGAT 

GGT T AAT G AAAC TAT T AC AC AAAAT GAT G GT AC AT AT AG CAT TAT T AAG A 

AAC T T T AT T TAG AT C G AC AAGG AAAT AT T GT AG C AC AAGAGAT T GT AT C A 

G AC ACT T T AAAC C AAAC AAAAT C AAAAT C T AC AAAAAT C AAC C CT G T AAC 

T AC AAT T C AC AAAAAAC AAT T AC AC CAAT T T AC AG C T AT T AAC C CT AT G A 

G AAAT T AT GG C AAAC C AT CAAACTCC ACT ACT GT AAAAT C AAAA 

SEQ ID NO. 7106 

STRAIN M732 

ACCAAGTCGGTGTCCAAGTTATAGGCGTCAATGACTTTCATGGTGCACTT 
G AC AAT AC T GG AAC AG C AAAT AT G C C T G AC G G AAAAGT T AC T AAT G C T G G 
CACTGCTGCT CAAT TAG AT G C T TAT AT G GAT GAT GC T C AAAAAG AT T T C A 
AAC AAAC T AAC C C T AAT G GT G AAAG CAT T AG AG T T C AAG CT GGT GAT AT G 
GT T GG AGC AAG T CC AG CT AAC T C AGGG CT T CT T C AAG AT GAAC C AAC C GT 
T AAAAC AT T T AAT GC AAT G AAT G T T G AGT AT GG C AC AT T AGGT AAC CAT G 
AAT T T GAT G AAGGT T T GG C AG AAT AC AAT C G T AT CGT TACT G G AAAG G C C 
C C T G C T C C AG AT T C T AAT AT AAAT AAT AT T AC G AAAT CAT AC C C AC AC G A 
AGCTGCAAAACAAGAAATTGTAGTGGCAAACGTTATTGATAAAGTTAACA 
AACAAATCCCTTACAATTGGAAACCTTACACTATTAAAAATATTCCTGTA
AATAACAAAAGTGTGAACGTTGGCTTTATCGGAATCGTTACCAAAGACAT
C C C AAAC CT T GT C T TAG GT AAAAAT TAT G AAC AAT AT G AAT T T T TAG AT G 
AAGCT GAAACAAT CGT T AAATACG C C AAAGAAT TAC AAGCT AAAAAT GT C 
AAGGCTATTGTAGTCCTTGCTCATGTACCTGCAACAAGCAAGGATGATAT 
T G CT G AAGGT G AAG C AG C AGAAAT GAT G AAAAAAGT C AAT C AAC T CT T C C 
CT GAAAAT AG C GT AG AT AT TGTCTTTGCTG G AC AC AAT CAT C AAT AT AC A 
AATGGTCTTGTTGGTAAAACTCGTATTGTACAAGCGCTCTCTCAAGGAAA 
AG C CT AT G CT G AT G T AC GT G G T GT C C T AGAT AC T GAT AC AC AAG AT T T C A 
TTGAAACCCCTTCAGCTAAAGTAATTGCAGTTGCTCCTGGTAAAAAAACA 
G GT AGT G C CG AT AT T C AAG C CAT T GT T G AC C AAG C T AAT ACT AT CGT T AA 
AC AAGT AAC AG AAG CT AAAAT T G GT AC T G C C G AGGT AAG T GG C AT G AT T A 
CGCGTTCTGTTGATCAAGATAATGTTAGTCCGGTAGGCAGCCTCATCACA 
GAG G CT C AACT AG C AAT T G C T C G AAAAAG C T G G C C AGAT AT CGAT T T T GC 
CAT GACAAATAATGGTGGC ATT CGTGCTGACT TACT CAT CAAACCAGATG 
GAACAATCACCTGGGGAGCTGCACAAGCAGTTCAACCTTTTGGTAATATC 
TTACAAGTCGTCGAAATTACTGGTAGAGATCTTTATAAAGCACTCAACGA 
AC AAT AC GAC CAAAAAC AAAAT TTCTTCCTT C AAAT AGC T GGTC T G C GAT 
AC ACT T ACAC AGAT AAT AAAGAGGGCGGGGAAGAAACACCATT T AAAGT T 
GT AAAAG C T TAT AAAT C AAAT GG T GAG G AAAT C AAT C CT G AT GC AAAAT A 
CAAATTAGTTATCAATGACTTTTTATTCGGTGGTGGTGATGGCTTTGCAA 
G CT T C AG AAAT GC C AAACT T CT AGG AG C C AT T AAT C C CG AT AC AGAG GT A 
T T T AT G G CCT AT AT C ACT GAT T TAG AAAAAG C T G GT AAAAAAGT GAG CAT 
T C C AAAT AAT AAACCT AAAAT CT AT GT C ACT AT GAAGATGGT T AAT G AAA 
C TAT TAC AC AAAAT G AT GGT AC AT AT AG CAT TAT T AAGAAAC T T TAT T T A 
GAT CG ACAAGGAAAT ATT GT AGC ACAAGAGATTGT AT CAG ACACTT TAAA 
C C AAAC AAAAT C AAAAT C TAC AAAAAT C AAC CCT GT AAC TAC AAT T C AC A 
AAAAAC AAT TACACCAAT TT ACAGCT AT T AAC CCT AT GAGAAAT T AT GGC 
AAACC AT C AAACT C C ACT ACT GT AAAAT C AAAAC AA 

SEQ ID NO. 7107 

STRAIN COH1 

ACCAAGTCGGTGTCCAAGTTATAGGCGTCAATGACTTTCATGGTGCACTT 
GAC AAT ACT G G AAC AG C AAAT AT G C CT G ACG G AAAAG T TAC T AAT G CT G G 
CACTGCTGCTCAATTAGATGCTTATATGGATGATGCTCAAAAAGATTTCA 
AACAAACT AACCCT AATGGTGAAAGCATT AG AGT TCAAGCT GGT GATAT G 
GTTGGAGCAAGTCCAGCTAACTCAGGGCTTCTTCAAGATGAACCAACCGT 
T AAAACAT T TAATGC AAT GAATGT TGAGTAT GGCACATT AGGT AAC CAT G 
AAT T T GAT G AAG GT T T G G CAG AAT AC AAT C G TAT C GT T AC T GG AAAG GC C 
CCT GCT C C AGAT T CT AATAT AAAT AAT AT TACGAAAT CAT AC CC AC ACGA 
AG CT G C AAAAC AAGAAAT T GT AGT G G C AAAC G T TAT T GAT AAAGT T AAC A 
AACAAAT C C CT T AC AATT GGAAACCTT ACACT AT T AAAAATATT CCT GT A 
AAT AAC AAAAGT GT G AAC GT T G G C T T TAT C GG AAT C GT T AC C AAAG AC AT 
C C C AAAC C T T G t C TT AC G T AAAAAT TAT G AAC AAT AT GAAT T T T TAG AT G 
AAG CT GAAACAAT CGT T AAAT AC GC C AAAG AAT TAC AAG C T AAAAAT GT C 
AAGG C TAT T G T AGT C C T T GCT C AT GT AC CT G C AAC AAGC AAG GAT GATAT 
T G C T G AAG GT G AAG CAG CAG AAAT GAT G AAAAAAG T C AAT C AAC T C T T C C 
CTGAAAATAGCGTAGATATTGTCTTTGCTGGACACAATCATCAATATACA 
AATGGTCTTGTTGGTAAAACTCGTATTGTACAAGCGCTCTCTCAAGGAAA 
AGCCTATGCTGATGTACGTGGTGTCCTAGATACTGATACACAAGATTTCA 
TTGAAACCCCTTCAGCTAAAGTAATTGCAGTTGCTCCTGGTAAAAAAACA 
GGTAGTGCCGATATTCAAGCCATTGtTGACCAAGCTAATACTATCGTTAA 
ACAAGTAACAGAAGCTAAAATTGGTACTGCCGAGGTAAGTGGCATGATTA 
CGCGTTCTGTTGATCAAGATAATGTTAGTCCGGTAGGCAGCCTCATCACA 
GAG GCT C AAC TAG C AAT T G C T CG AAAAAG CT GGC CAG AT AT C GAT T T T G C 
CAT GAC AAAT AAT GGT G G CAT T C G T G C T G AC T T AC T CAT C AAAC CAG AT G 
GAACAATCACCTGGGGAGCTGCACAAGCAGTTCAACCTTTTGGTAATATC 
TTACAAGTCGTCGAAATTACTGGTAGAGATCTTTATAAAGCACTCAACGA 
AC AAT AC GAC CAAAAAC AAAAT TTCTTCCTT C AAAT AG CT G GT C T G C G AT 
ACAC T TAC AC AGAT AAT AAAG AGG G C GGG G AAG AAAC AC C AT T T AAAGT T 
G T AAAAG CT T AT AAAT C AAAT G GT G AGG AAAT C AAT CCTG AT GC AAAAT A 
CAAATTAGTTATCAATGACTTTTTATTCGGTGGTGGTGATGGCTTTGCAA 
G CT T CAG AAAT G C C AAAC T T CT AG GAG C C AT T AAT C C C GAT AC AG AGGT A 
T T T AT GG C C TAT AT C AC T GAT T TAG AAAAAG C T G GT AAAAAAG T GAG CAT 
TCCAAATAATAAACCTAAAATCTATGTCACTATGAAGATGGTTAATGAAA
CTATTACACAAAATGATGGTACATATAGCATTATTAAGAAACTTTATTTA
GAT C G AC AAG G AAAT AT T GT AG C AC AAG AG AT T GT AT C AGAC ACT T T AAA 
CCAAACAAAAT CAAAAT CTACAAAAATCAACC CTGT AACT ACAATT C ACA 
AAAAAC AAT T AC AC C AAT T T AC AG C T AT T AAC C C TAT GAG AAAT T AT GGC 
AAACCAT CAAACT CCACTACTGT AAAATC AAA 

SEQ ID NO. 7108 

STRAIN M781 

CAAGTCGGTGTCCAAGTTATAGGCGTCAATGACTTTCATGGTGCACTTGA 
CAATACTGGAACAGCAAATATGCCTGACGGAAAAGTTACTAATGCTGGCA 
CTGCTGCTCAATTAGATGCTTATATGGATGATGCTCAAAAAGATTTCAAA 
CAAACTAACCCTAATGGTGAAAGCATTAGAGTTCAAGCTGGTGATATGGT 
TGGAGCAAGTCCAGCTAACTCAGGGCTTCTTCAAGATGAACCAACCGTTA 
AAAC AT T T AAT G C AAT GAAT GT TGAGTATGGCACATT AGGTAAC CAT GAA 
T T T GAT G AAG G T T T G G C AG AAT AC AAT CGT AT CGT TACT GGAAAGGC C C C 
TGCT C C AGAT T CTAAT AT AAAT AAT ATT ACGAAAT CAT ACC CACACGAAG 
CT G CAAAAC AAG AAAT T GT AGT GG C AAAC G T T AT T GAT AAAGT T AAC AAA 
CAAATCCCTTACAATTGGAAACCTTACACTATTAAAAATATTCCTGTAAA 
TAACAAAAGTGTGAACGTTGGCTTTATCGGAATCGTTACCAAAGACATCC 
C AAAC CT T GT CT T ACGT AAAAAT TAT GAAC AAT AT GAAT T TT T AGAT GAA 
G C T G AAAC AAT CGT T AAAT ACG C C AAAG AAT T AC AAG CT AAAAAT GT C AA 
G GC T AT T GT AGT C CT T G C T C AT GT AC CT G C AAC AAG C AAGG AT GAT AT T G 
C T G AAG GT G AAG C AG C AG AAAT GAT G AAAAAAGT C AA t C AACT C T T C C CT 
GAAAATAGCGTAGATATTGTCTTTGCTGGACACAATCATCAATATACAAA 
TGGTCTTGTTGGTAAAACTCGTATTGTACAAGCGCTCTCTCAAGGAAAAG 
C C TAT G C T G AT GT AC GT G G T GT C C TAG AT AC T GAT AC AC AAG AT T T CAT T 
G AAAC C C C T T C AG CT AAAGT AAT T G C AGT T G C T C CT GGT AAAAAAAC AGG 
TAGTGCCGATATTCAAGCCATTGtTGACCAAGCTAATACTATCGTTAAAC 
AAGT AAC AG AAG CT AAAAT T GGT ACT G C C G AG GT AAG T G GC AT G AT T AC G 
CGTTCTGTT GAT C AAG AT AAT GT T AGT C C GGT AG G C AG C CT CAT C AC AG A 
GGCTCAACTAGCAATTGCTCGAAAAAGCTGGCCAGATATCGATTTTGCCA 
TGACAAATAATGGTGGCATTCGTGCTGACTTACTCATCAAACCAGATGGA 
ACAATCACCTGGGGAGCTGCACAAGCAGTTCAACCTTTTGGTAATATCTT 
AC AAGT CGT CG AAAT T AC T GGT AG AG AT C T T TAT AAAG C ACT C AAC GAAC 
AATACGACCAAAAACAAAATTTCTTCCTTCAAATAGCTGGTCTGCGATAC 
ACTTACACAGATAATAAAGAGGGCGGGGAAGAAACACCATTTAAAGTTGT 
AAAAGCTTATAAATCAAATGGTGAGGAAATCAATCCTGATGCAAAATACA 
AATTAGTTATCAATGACTTTTTATTCGGTGGTGGTGATGGCTTTGCAAGC 
TTCAGAAATGCCAAACTTCTAGGAGCCATTAATCCCGATACAGAgGTATT 
TAT GGC C TAT AT C AC T GAT T TAGAAAAAGC T GGT AAAAAAGT GAG CAT T C 
CAAATAATAAACCTAAAATCTATGTCACTATGAAGATGGTTAATGAAACT 
AT T AC AC AAAAT GAT G GT AC AT AT AG CAT T AT T AAG AAACT T T AT T T AG A 
T CGAC AAGG AAAT AT T GT AGCACAAGAGATT GT AT C AGACACTTT AAACC 
AAAC AAAAT CAAAAT CT ACAAAAATCAACCCTGT AACT ACAATT CAC AAA 
AAAC AAT T AC AC C AAT T T AC AG C T AT T AAC C C T AT GAG AAAT TAT GG C AA 
AC CAT C AAAC T C CAC T AC T GT AAAAT C AAA 

SEQ ID NO. 7109 

STRAIN CJB110 

GACCAAGTCGGTGTCCAAGTTATAGGCGTCAATGACTTTCATGGTGC 
AC T T GAC AAT AC T G GAAC AG C AAAT AT GC CT G AC G G AAAAGT T AC T AAT G 
CTGGCACTGCTGCTCAATTAGATGCTTATATGGATGATGCTCAAAAAGAT 
TTCAAACAAACTAACCCTAATGGTGAAAGCATTAGAGTTCAAGCTGGTGA 
TAT GGT TGGAGCAAGTCCAGCT AACT CAGGGCTTCTTCAAGAT GAAC CAA 
C C G T T AAAAC AT T T AAT G C AAT GAAT G T T GAG TAT G G CAC AT T AG GT AAC 
CAT GAAT T T GAT G AAG GT T T G G C AG AAT AC AAT C GT AT CGTTACTG G AAA 
GGCCCCTGCTC C AG AT T c T AAT AT AAAT AAT AT T AC G AAAT CAT AC C CAC 
ACGAAGCTGCAAAACAAGAAATTGTAGTGGCAAACGTTATTGATAAAGTT 
AACAAACAAATCCCTTACAATTGGAAACCTTACGCTATTAAAAATATTCC 
TGTAAATAACAAAAGTGTGAACGTTGGCTTTATCGGAATCGTTACCAAAG 
AC AT C C C AAAC CT T GT C T T AC G T AAAAAT TAT GAAC AAT AT GAAT T T T T A 
GAT G AAG C T G AAAC AAT CGT T AAAT AC G C C AAAG AAT T AC AAG C T AAAAA 
T GT C AAGGCT ATTGT AGT CCT TGCT CATGT AC CTGC AAC AAGC AAGG ATG 
ATATTGCTGAAGGTGAAGCAGCAGAAATGATGAAAAAAGTCAATCAACTC
TTCCCTGAAAATAGCGTAGATATTGTCTTTGCTGGACACAATCATCAATA
TACAAATGGTCTTGTTGGTAAAACTCGCATTGTACAAGCGCTCTCTCAAG 
GAAAAGCCTATGCTGACGTACGTGGTGTCCTAGATACTGATACACAAGAT 
TTCATTGAAACCCCTTCAGCTAAAGTAGTTGCAGTTGCTCCTGGTAAAAA 
AACAGGTAGTGCCGATATTCAAGCCATTGTTGACCAAGCTAATACTATCG 
T T AAACAAGTAAC AGAAG CT AAAAT T G GT ACT G CC G AGGT AAGT GG C AT G 
AT T AC GCGTTCTGTT GAT C AAGAT AAT GT T AGT C C AGT AGG C AG CC T CAT 
C AC AGAG GC T C AAC T AG C AAT T G C T C GAAAAAG CT GG C C AG AT AT C GAT T 
T T GC CAT G AC AAAT AAT G GT GG C AT T C GT G CT G ACT TACT CAT C AAAC C A 
GATGGAACAATCACCTGGGGAGCTGCACAAGCAGTTCAACCTTTTGGTAA 
TAT C T T AC AAGT C GT CG AAAT TAG T GGT AG AG AT C T T T AT AAAG C AC T C A 
ACG AAC AAT ACG AC CAAAAAC AAAAT TTCTTCCTT C AAAT AG CT G GT C T G 
CGAT ACACTT ACACAGAT AAT AAAGAGGGCGGAGAAGAAAC AC CAT T T AA 
AGTTGTAAAAGCT TAT AAAT C AAAT GGT GAAGAAATC AAT CCTGATGCAA 
AATACAAATTAGTTATCAATGACTTTTTATTCGGTGGTGGTGATGGCTTT 
G C AAGC T T C AG AAAT GC C AAACT T CT AGG AG C CAT T AAT C C C GAT AC AG A 
G GT AT T TAT G G C C TAT AT C AC T GAT T T AGAAAAAGC T G G T AAAAAAGT G A 
G C G T T C C AAAT AAT AAAC C T AAAAT C T AT GT C AC TAT G AAGAT GGT T AAT 
G AAACT AT TAG AC AAAAT GAT G GT AC AC AT AGC AT T AT T AAGAAAC T T T A 
T T T AGAT C GAC AAGGAAAT AT T GT AG C AC AAGAGAT T GT AT C AG AC ACT T 
T AAAC C AAAC AAAAT C AAAAT C T AC AAAAAT C AAC C CT GT AACT AC AATT 
CACAAAAAACAATTACACCAATTTACAGCTATTAACCCTATGAGAAATTA 
TGGCAAACC AT CAAACTCCACTACTGT AAAAT CA 

SEQ ID NO. 7110 

STRAIN 1169NT 

CAAGTCGGTGTCCAAGTTATAGGCGTCAATGACTTTCATGGTGCACTTGA 
C AAT AC T G G AAC AG C AAAT AT G C CT G AT GG AAAAGT T G C T AAT G C T GG T A 
CTGCTGCT C AAT T AGAT G C T TAT AT GG AT GAC GC T C AAAAAG AT T T C AAA 
C AAAC T AAC C C T AAT GGT G AAAG C AT T AGG GT T C AAG C AG G C GAT AT GGT 
T G GAG C AAGT C C AG C C AACT CTGGGCTTCTT C AAGAT G AAC C AACT GT C A 
AAAAT T T T AAT G C AAT G AAT GT T G AGT AT GGC AC AT T G G GT AAC CAT GAA 
TTTGATGAAGGGTTGGCAGAATATAATCGTATCGTTACTGGTAAAGCCCC 
TGCTCCAGATTCTAATATTAATAATATTACGAAATCATACCCACATGAAG 
CTGCAAAACAAGAAATTGTAGTGGCAAATGTTATTGATAAAGTTAACAAA 
CAAATTCCTTACAATTGGAAGCCTTACGCTATTAAAAATATTCCTGTAAA 
TAACAAAAGTGTGAACGTTGGCTTTATCGGGATTGTCACCAAAGACATCC 
C AAAC CT T G T C T T AC GT AAAAAT T AT GAAC AAT AT G AAT T T T TAG AT GAA 
G C T G AAAC AAT CGT T AAAT AC G C C AAAGAAT T AC AAG C T AAAAAT G T C AA 
AG CT AT T GT AG t T CT C G C AC AT GT AC CT GC AAC AAGT AAAAAT GAT AT T G 
C T GAAGGT G AAGC AG C AG AAAT GAT G AAAAAAG T C AAT C AAC TCTTCCCT 
G AAAAT AG CGT AG AT AT TGTCTTTGCTG GAC AC AAT CAT C AAT AT AC AAA 
TGGTCTTGTTGGTAAAACTCGTATTGTACAAGCGCTCTCTCAAGGAAAAG 
C C T AT G CT GAT GT AC GTGGTGTCT T AGAT ACT GAT AC AC AAGAT T T CAT T 

GAGACCCCTTCAGCTAAAGTAATTGCAGTTGCTCCTGGTAAAAAAACAGG 
T AGT G C C GAT AT T C AAG C CAT T GT T GAC C AAG C T AAT AC TAT C GT T AAAC 
AAG T AAC AGAAGC T AAAAT TGGTACTGCC G AGGT AAG T G T CAT G AT T AC G 
CGTTCTGTT GAT C AAG AT AAT GT T AGT C C GG T AGG C AG C C T CAT C AC AG A 
GGCTCAACTAGCAATTGCTCGAAAAAGCTGGCCAGATATCGATTTTGCCA 
T GAC AAAT AAT GGT GG CAT T C GT G C T G AC T T AC T CAT C AAAC C AG AT GG A 

ACAATCACCTGGGGAGCTGCACAAGCAGTTCAACCTTTTGGTAATATCTT 
AC AAG T CGT C G AAAT T AC T GGT AG AG AT C T T TAT AAAG C AC T C AAC GAAC 

AATACGACCAAAAACAAAATTTCTTCCTTCAAATAGCTGGTCTGCGATAC 
AC T T AC AC AG AT AAT AAAGAG GG CG GG G AAG AAAC AC CAT T T AAAGT T G T 
AAAAGCT TAT AAAT C AAAT G GT G AG G AAAT C AAT C C T GAT G C AAAAT AC A 

AATTAGTTATCAATGACTTTTTATTCGGTGGTGGTGATGGCTTTGCAAGC 
T T C AGAAAT G C C AAACT T C T AGG AG C CAT T AAC C C C GAT AC AG AG G T AT T 

TATGGCCTATATCACTGATTTAGAAAAAGCTGGTAAAAAAGTGAGCGTTC 
C AAAT AAT AAACC T AAAAT C T AT GT C AC T AT G AAG AT GGT T AAT G AAAC T 
AT T AC AC AAAAT G AT GG TAG AC AT AG CAT TAT T AAG AAAC T T TAT T TAG A 
T C GAC AAGGAAAT AT T G TAG C AC AAG AG AT T GT AT C AG AC AC T T T AAAC C 

AAACAAAATCAAAATCTACAAAAATCAACCCTGTAACTACAATTCACAAA 
AAAC AAT T AC AC C AAT T T AC AGC TAT T AAC C C T AT GAG AAAT TAT G G C AA 
ACCATCAAACTCCACTACTGTAAAATCAAA

SEQ ID NO. 7111

STRAIN JM9130013 

CGGTGTCCAAGTTATAGGCGTCAATGACTTTCATGGTGCACTTGACAATA 
CTGGAACAGCAAATATGCCTGACGGAAAAGTTACTAATGCTGGCACTGCT 
G CT C AAT T AGAT GCT T AT AT GGAT GAT G C T C AAAAAG AT T TC AAAC AAAC 
TAACCCTAATGGTGAAAGCATTAGAGTTCAAGCTGGTGATATGGTTGGAG 
C AAGT C C AG CT AAC T C AG GGCTTCTT C AAG AT G AAC C AAC CGT T AAAAC A 
T T T AAT G C AAT G AAT G T T GAG T AT G G C AC AT TAG G T AAC CAT G AAT T T G A 
TGAAGGTTTGGCAGAATACAATCGTATCGTTACTGGAAAGGCCCCTGCTC 
CAGATTcTAATATAAATAATATTACGAAATCATACCCACACGAAGCTGCA 
AAACAAGAAATTGTAGTGGCAAACGTTATTGATAAAGTTAACAAACAAAT 
CCCTTACAATTGGAAACCTTACACTATTAAAAATATTCCTGTAAATAACA 
AAAGTGTGAACGTTGGCTTTATCGGAATCGTTACCAAAGACATCCCAAAC 
C TT GT C TT AC G T AAAAAT T AT G AAC AAT AT G AAT T T T T AGAT G AAG C T G A 
AAC AAT CGT T AAAT AC G C C AAAG AAT T AC AAG C T AAAAAT GT C AAG G C T A 
T T GT AG TCCTTGCT CAT GT AC CT G C AAC AAG C AAGG AT GAT AT T G C T G AA 
GGT GAAG C AG C AGAAAT GAT G AAAAAAG T C AAT C AAC TCTTCCCT G AAAA 
TAG C GT AG AT AT TGTCTTTGCTG G AC AC AAT CAT C AAT AT AC AAAT GGT C 
TTGTTGGTAAAACTCGTATTGTACAAGCGCTCTCTCAAGGAAAAGCCTAT 
GCTGATGTACGTGGTGTCCTAGATACTGATACACAAGATTTCATTGAAAC 
CCCTTCAGCTAAAGTAATTGCAGTTGCTCCTGGTAAAAAAACAGGTAGTG 
C C GAT AT T C AAG C CAT T GT T G AC C AAG C T AAT ACT AT C GT T AAAC AAGT A 
ACAGAAGCTAAAATTGGTACTGCCGAGGTAAGTGGCATGATTACGCGTTC 
T GT T GAT C AAG AT AAT GT T AGT C CGGT AG GC AG C CT CAT C AC AG AGG CT C 
AAC TAG C AAT T G C T C G AAAAAG C T GG C C AG AT AT C GAT T T TG C CAT G AC A 
AAT AAT GGT G G C AT T C GT G C T G AC T T AC T CAT C AAAC C AG AT G G AAC AAT 
C AC C T GGGG AG C T G C AC AAG C AGT T C AAC C T T T T GGT AAT AT C T T AC AAG 
T CGT C G AAAT T AC T G G TAG AG AT CT T TAT AAAG C ACT C AACG AAC AAT AC 
GACCAAAAACAAAATTTCTTCCTTCAAATAGCTGGTCTGCGATACACTTA 
C AC AG AT AAT AAAG AG G G C G G G GAAG AAAC AC CAT T T AAAGT T G T AAAAG 
C T TAT AAAT C AAAT GGT GAG G AAAT C AAT C CT GAT G C AAAAT AC AAAT T A 
GTTATCAATGACTTTTTATTCGGTGGTGGTGATGGCTTTGCAAGCTTCAG 
AAATGCCAAACTTCTAGGAGCCATTAATCCCGATACAGAGGTATTTATGG 
C CT AT AT C ACT GAT T TAG AAAAAG C T G GT AAAAAAGT GAG CGT T C C AAAT 
AATAAACCTAAAATCTATGTCACTATGAAGATGGTTAATGAAACTATTAC 
ACAAAATGATGGTACATATAGCATTATTGAGAAACTTTATTTAGATCGAC 
AAG GAAAT AT T GT AG C AC AAG AG AT T GT AT C AG AC AC T T T AAAC C AAAC A 
AAAT C AAAAT C T AC AAAAAT C AAC C C T G T AAC T AC AAT T C AC AAAAAAC A 
AT T AC AC C AAT T T AC AG C TAT T AAC C C T AT GAG AAAT TAT GG C AAAC CAT 
C AAAC T C C ACT ACT G T AAAAT C AAAA 



SEQ ID NO. 7112 

STRAIN 2603 frame: 1

MKKKIILKSSVLGLVAGTSIMFSSVFADQVGVQVIGVNDFHGALDNTGTANMPDGKVANA 
GTAAQLDAYMDDAQKDFKQTNPNGESIRVQAGDMVGASPANSGLLQDEPTVKNFNAMNVE 
YGTLGNHEFDEGLAEYNRIVTGKAPAPDSNINNITKSYPHEAAKQEIVVANVIDKVNKQI 
PYNWKPYAIKNIPVNNKSVNVGFIGIVTKDIPNLVLRKNYEQYEFLDEAETIVKYAKELQ 
AKNVKAIVVLAHVPATSKNDIAEGEAAEMMKKVNQLFPENSVDIVFAGHNHQYTNGLVGK
TRIVQALSQGKAYADVRGVLDTDTQDFIETPSAKVIAVAPGKKTGSADIQAIVDQANTIV 
KQVTEAKIGTAEVSVMITRSVDQDNVSPVGSLITEAQLAIARKSWPDIDFAMTNNGGIRA 
DLLIKPDGTITWGAAQAVQPFGNILQVVEITGRDLYKALNEQYDQKQNFFLQIAGLRYTY 
TDNKEGGEETPFKVVKAYKSNGEEINPDAKYKLVINDFLFGGGDGFASFRNAKLLGAINP 
DTEVFMAYITDLEKAGKKVSVPNNKPKIYVTMKMVNETITQNDGTHSIIKKLYLDRQGNI 
VAQEIVSDTLNQTKSKSTKINPVTTIHKKQLHQFTAINPMRNYGKPSNSTTVKSKQLPKT
NSEYGQSFLMSVFGVGLIGIALNTKKKHMK



SEQ ID NO. 7113 

STRAIN 090 frame: 3 

VGVQVIGVNDFHGALDNTGTANMPDGKVTNAGTAAQLDAYMDDAQKDFKQTNPNGESIRV 
QAGDMVGASPANSGLLQDEPTVKTFNAMNVEYGTLGNHEFDEGLAEYNRIVTGKAPAPDS
NINNITKSYPHEAAKQEIVVANVIDKVNKQIPYNWKPYAIKNIPVNNKSVNVGFIGIVTK 
DIPNLVLRKNYEQYEFLDEAETIVKYAKELQAKNVKAIVVLAHVPATSKDDIAEGEAAEM 
MKKVNQLFPENSVDIVFAGHNHQYTNGLVGKTRIVQALSQGKAYADVRGVLDTDTQDFIE
TPSAKVVAVAPGKKTGSADIQAIVDQANTIVKQVTEAKIGTAEVSGMITRSVDQDNVSPV
GSLITEAQLAIARKSWPDIDFAMTNNGGIRADLLIKPDGTITWGAAQAVQPFGNILQVVE 
ITGRDLYKALNEQYDQKQNFFLQIAGLRYTYTDNKEGGEETPFKVVKAYKSNGEEINPDA 
KYKLVINDFLFGGGDGFASFRNAKLLGAINPDTEVFMAYITDLEKAGKKVSVPNNKPKIY
VTMKMVNETITQNDGTHSIIKKLYLDRQGNIVAQEIVSDTLNQTKSKSTKINPVTTIHKK 
QLHQFTAINPMRNYGKPSNSTTVKSKQ 

SEQ ID NO. 7114 

STRAIN A909 frame: 3

VNDFHGALDNTGTANMPDGKVTNAGTAAQLDAYMDDAQKDFKQTNPNGESIRVQAGDMVG
ASPANSGLLQDEPTVKTFNAMNVEYGTLGNHEFDEGLAEYNRIVTGKAPAPDSNINNITK 
SYPHEAAKQEIVVANVIDKVNKQIPYNWKPYTIKNIPVNNKSVNVGFIGIVTKDIPNLVL
RKNYEQYEFLDEAETIVKYAKELQAKNVKAIVVLAHVPATSKDDIAEGEAAEMMKKVNQL
FPENSVDIVFAGHNHQYTNGLVGKTRIVQALSQGKAYADVRGVLDTDTQDFIETPSAKVI 
AVAPGKKTGSADIQAIVDQANTIVKQVTEAKIGTAEVSGMITRSVDQDNVSPVGSLITEA 
QLAIARKSWPDIDFAMTNNGGIRADLLIKPDGTITWGAAQAVQPFGNILQVVEITGRDLY 
KALNEQYDQKQNFFLQIAGLRYTYTDNKEGGEETPFKVVKAYKSNGEEINPDAKYKLVIN 
DFLFGGGDGFASFRNAKLLGAINPDTEVFMAYITDLEKAGKKVSVPNNKPKIYVTMKMVN 
ETITQNDGTYSIIKKLYLDRQGNIVAQEIVSDTLNQTKSKSTKINPVTTIHKKQLHQFTA 
INPMRNYGKPSNSTTVKSKQ 

SEQ ID NO. 7115 

STRAIN H36B frame: 2

QVGVQVIGVNDFHGALDNTGTANMPDGKVTNAGTAAQLDAYMDDAQKDFKQTNPNGESIR 
VQAGDMVGASPANSGLLQDEPTVKTFNAMNVEYGTLGNHEFDEGLAEYNRIVTGKAPAPD 
SNINNITKSYPHEAAKQEIVVANVIDKVNKQIPYNWKPYTIKNIPVNNKSVNVGFIGIVT
KDIPNLVLRKNYEQYEFLDEAETIVKYAKELQAKNVKAIVVLAHVPATSKDDIAEGEAAE
MMKKVNQLFPENSVDIVFAGHNHQYTNGLVGKTRIVQALSQGKAYADVRGVLDTDTQDFI 
ETPSAKVIAVAPGKKTGSADIQAIVDQANTIVKQVTEAKIGTAEVSGMITRSVDQDNVSP 
VGSLITEAQLAIARKSWPDIDFAMTNNGGIRADLLIKPDGTITWGAAQAVQPFGNILQVV 
EITGRDLYKALNEQYDQKQNFFLQIAGLRYTYTDNKEGGEETPFKVVKAYKSNGEEINPD
AKYKLVINDFLFGGGDGFASFRNAKLLGAINPDTEVFMAYITDLEKAGKKVSVPNNKPKI 
YVTMKMVNETITQNDGTYSIIKKLYLDRQGNIVAQEIVSDTLNQTKSKSTKINPVTTIHK 
KQLHQFTAINPMRNYGKPSNSTTVKS 

SEQ ID NO. 7116 

STRAIN 18RS21 frame: 1 

DQVGVQVIGVNDFHGALDNTGTANMPDGKVXNAGTAAQLDAYMDDAQKDFKQTNPNGESI
RVQAGDMVGASPANSGLLQDEPTVKTFNAMNVEYGTLGNHEFDEGLAEYNRIVTGKAPAP
DSNINNITKSYPHEAAKQEIVVANVIDKVNKQIPYNWKPYTIKNIPVNNKSVNVGFIGIV
TKDIPNLVLRKNYEQYEFLDEAETIVKYAKELQAKNVKAIVVLAHVPATSKDDIAEGEAA
EMMKKVNQLFPENSVDIVFAGHNHQYTNGLVGKTRIVQALSQGKAYADVRGVLDTDTQDF 
IETPSAKVIAVAPGKKTGSADIQAIVDQANTIVKQVTEAKIGTAEVSGMITRSVDQDNVS 
PVGSLITEAQLAIARKSWPDIDFAMTNNGGIRADLLIKPDGTITWGAAQAVQPFGNILQV 
VEITGRDLYKALNEQYDQKQNFFLQIAGLRYTYTDNKEGGEETPFKVVKAYKSNGEEINP
DAKYKLVINDFLFGGGDGFASFRNAKLLGAINPDTEVFMAYITDLEKAGKKVSVPNNKPK 
IYVTMKMVNETITQNDGTYSIIKKLYLDRQGNIVAQEIVSDTLNQTKSKSTKINPVTTIH 
KKQLHQFTAINPMRNYGKPSNSTTVKSK 

SEQ ID NO. 7117 

STRAIN M732 frame: 3 

QVGVQVIGVNDFHGALDNTGTANMPDGKVTNAGTAAQLDAYMDDAQKDFKQTNPNGESIR 
VQAGDMVGASPANSGLLQDEPTVKTFNAMNVEYGTLGNHEFDEGLAEYNRIVTGKAPAPD
SNINNITKSYPHEAAKQEIVVANVIDKVNKQIPYNWKPYTIKNIPVNNKSVNVGFIGIVT
KDIPNLVLRKNYEQYEFLDEAETIVKYAKELQAKNVKAIVVLAHVPATSKDDIAEGEAAE
MMKKVNQLFPENSVDIVFAGHNHQYTNGLVGKTRIVQALSQGKAYADVRGVLDTDTQDFI 
ETPSAKVIAVAPGKKTGSADIQAIVDQANTIVKQVTEAKIGTAEVSGMITRSVDQDNVSP 
VGSLITEAQLAIARKSWPDIDFAMTNNGGIRADLLIKPDGTITWGAAQAVQPFGNILQVV 
EITGRDLYKALNEQYDQKQNFFLQIAGLRYTYTDNKEGGEETPFKVVKAYKSNGEEINPD 
AKYKLVINDFLFGGGDGFASFRNAKLLGAINPDTEVFMAYITDLEKAGKKVSIPNNKPKI
YVTMKMVNETITQNDGTYSIIKKLYLDRQGNIVAQEIVSDTLNQTKSKSTKINPVTTIHK 
KQLHQFTAINPMRNYGKPSNSTTVKSKQ 

WO 2004/018646 PCT/US2003/026827
SEQUENCE LISTING

SEQ ID NO. 7118

STRAIN COH1 frame: 3 

QVGVQVIGVNDFHGALDNTGTANMPDGKVTNAGTAAQLDAYMDDAQKDFKQTNPNGESIR 
VQAGDMVGASPANSGLLQDEPTVKTFNAMNVEYGTLGNHEFDEGLAEYNRIVTGKAPAPD
SNINNITKSYPHEAAKQEIVVANVIDKVNKQIPYNWKPYTIKNIPVNNKSVNVGFIGIVT
KDIPNLVLRKNYEQYEFLDEAETIVKYAKELQAKNVKAIVVLAHVPATSKDDIAEGEAAE
MMKKVNQLFPENSVDIVFAGHNHQYTNGLVGKTRIVQALSQGKAYADVRGVLDTDTQDFI 
ETPSAKVIAVAPGKKTGSADIQAIVDQANTIVKQVTEAKIGTAEVSGMITRSVDQDNVSP 
VGSLITEAQLAIARKSWPDIDFAMTNNGGIRADLLIKPDGTITWGAAQAVQPFGNILQVV 
EITGRDLYKALNEQYDQKQNFFLQIAGLRYTYTDNKEGGEETPFKVVKAYKSNGEEINPD 
AKYKLVINDFLFGGGDGFASFRNAKLLGAINPDTEVFMAYITDLEKAGKKVSIPNNKPKI
YVTMKMVNETITQNDGTYSIIKKLYLDRQGNIVAQEIVSDTLNQTKSKSTKINPVTTIHK 
KQLHQFTAINPMRNYGKPSNSTTVKS 

SEQ ID NO. 7119 

STRAIN M781 frame: 1 

QVGVQVIGVNDFHGALDNTGTANMPDGKVTNAGTAAQLDAYMDDAQKDFKQTNPNGESIR
VQAGDMVGASPANSGLLQDEPTVKTFNAMNVEYGTLGNHEFDEGLAEYNRIVTGKAPAPD
SNINNITKSYPHEAAKQEIVVANVIDKVNKQIPYNWKPYTIKNIPVNNKSVNVGFIGIVT 
KDIPNLVLRKNYEQYEFLDEAETIVKYAKELQAKNVKAIVVLAHVPATSKDDIAEGEAAE
MMKKVNQLFPENSVDIVFAGHNHQYTNGLVGKTRIVQALSQGKAYADVRGVLDTDTQDFI 
ETPSAKVIAVAPGKKTGSADIQAIVDQANTIVKQVTEAKIGTAEVSGMITRSVDQDNVSP 
VGSLITEAQLAIARKSWPDIDFAMTNNGGIRADLLIKPDGTITWGAAQAVQPFGNILQVV
EITGRDLYKALNEQYDQKQNFFLQIAGLRYTYTDNKEGGEETPFKVVKAYKSNGEEINPD 
AKYKLVINDFLFGGGDGFASFRNAKLLGAINPDTEVFMAYITDLEKAGKKVSIPNNKPKI
YVTMKMVNETITQNDGTYSIIKKLYLDRQGNIVAQEIVSDTLNQTKSKSTKINPVTTIHK
KQLHQFTAINPMRNYGKPSNSTTVKS 

SEQ ID NO. 7120 

STRAIN CJB110 frame: 1 

DQVGVQVIGVNDFHGALDNTGTANMPDGKVTNAGTAAQLDAYMDDAQKDFKQTNPNGESI 
RVQAGDMVGASPANSGLLQDEPTVKTFNAMNVEYGTLGNHEFDEGLAEYNRIVTGKAPAP
DSNINNITKSYPHEAAKQEIVVANVIDKVNKQIPYNWKPYAIKNIPVNNKSVNVGFIGIV
TKDIPNLVLRKNYEQYEFLDEAETIVKYAKELQAKNVKAIWLAHVPATSKDDIAEGEAA
EMMKKVNQLFPENSVDIVFAGHNHQYTNGLVGKTRIVQALSQGKAYADVRGVLDTDTQDF
IETPSAKVIAVAPGKKTGSADIQAIVDQANTIVKQVTEAKIGTAEVSGMITRSVDQDNVS
PVGSLITEAQLAIARKSWPDIDFAMTNNGGIRADLLIKPDGTITWGAAQAVQPFGNILQV
VEITGRDLYKALNEQYDQKQNFFLQIAGLRYTYTDNKEGGEETPFKVVKAYKSNGEEINP
DAKYKLVINDFLFGGGDGFASFRNAKLLGAINPDTEVFMAYITDLEKAGKKVSVPNNKPK
IYVTMKMVNETITQNDGTHSIIKKLYLDRQGNIVAQEIVSDTLNQTKSKSTKINPVTTIH
KKQLHQFTAINPMRNYGKPSNSTTVKS

SEQ ID NO. 7121 

STRAIN 1169NT frame: 1 

QVGVQVIGVNDFHGALDNTGTANMPDGKVANAGTAAQLDAYMDDAQKDFKQTNPNGESIR 
VQAGDMVGASPANSGLLQDEPTVKNFNAMNVEYGTLGNHEFDEGLAEYNRIVTGKAPAPD 
SNINNITKSYPHEAAKQEIVVANVIDKVNKQIPYNWKPYAIKNIPVNNKSVNVGFIGIVT
KDIPNLVLRKNYEQYEFLDEAETIVKYAKELQAKNVKAIWLAHVPATSKNDIAEGEAAE 
MMKKVNQLFPENSVDIVFAGHNHQYTNGLVGKTRIVQALSQGKAYADVRGVLDTDTQDFI 
ETPSAKVIAVAPGKKTGSADIQAIVDQANTIVKQVTEAKIGTAEVSVMITRSVDQDNVSP 
VGSLITEAQLAIARKSWPDIDFAMTNNGGIRADLLIKPDGTITWGAAQAVQPFGNILQVV 
EITGRDLYKALNEQYDQKQNFFLQIAGLRYTYTDNKEGGEETPFKVVKAYKSNGEEINPD
AKYKLVINDFLFGGGDGFASFRNAKLLGAINPDTEVFMAYITDLEKAGKKVSVPNNKPKI
YVTMKMVNETITQNDGTHSIIKKLYLDRQGNIVAQEIVSDTLNQTKSKSTKINPVTTIHK 
KQLHQFTAINPMRNYGKPSNSTTVKS 

SEQ ID NO. 7122 

STRAIN JM9130013 frame: 2 

GVQVIGVNDFHGALDNTGTANMPDGKVTNAGTAAQLDAYMDDAQKDFKQTNPNGESIRVQ 
AGDMVGASPANSGLLQDEPTVKTFNAMNVEYGTLGNHEFDEGLAEYNRIVTGKAPAPDSN
INNITKSYPHEAAKQEIVVANVIDKVNKQIPYNWKPYTIKNIPVNNKSVNVGFIGIVTKD
IPNLVLRKNYEQYEFLDEAETIVKYAKELQAKNVKAIWLAHVPATSKDDIAEGEAAEMM
KKVNQLFPENSVDIVFAGHNHQYTNGLVGKTRIVQALSQGKAYADVRGVLDTDTQDFIET
PSAKVIAVAPGKKTGSADIQAIVDQANTIVKQVTEAKIGTAEVSGMITRSVDQDNVSPVG
SLITEAQLAIARKSWPDIDFAMTNNGGIRADLLIKPDGTITWGAAQAVQPFGNILQVVEI
TGRDLYKALNEQYDQKQNFFLQIAGLRYTYTDNKEGGEETPFKVVKAYKSNGEEINPDAK
YKLVINDFLFGGGDGFASFRNAKLLGAINPDTEVFMAYITDLEKAGKKVSVPNNKPKIYV 
TMKMVNETITQNDGTYSIIEKLYLDRQGNIVAQEIVSDTLNQTKSKSTKINPVTTIHKKQ 
LHQFTAINPMRNYGKPSNSTTVKSK 

SEQ ID NO. 7201 
STRAIN 2603 

ATGAATAAACGCGTAAAAATCGTTGCAACACTTGGTCCTGCGGTTGAATTCCGTGGTG
GTAAGAAGTTTGGTGAGTCTGGATACTGGGGTGAAAGCCTTGACGTAGAAGCTTCAGCAG
AAAAAATTGCTCAATTGATTAAAGAAGGTGCTAACGTTTTCCGTTTCAACTTCTCACATG
GAGATCATGCTGAGCAAGGAGCTCGTATGGCTACTGTTCGTAAAGCAGAAGAGATTGCAG
GACAAAAAGTTGGCTTCCTCCTTGATACTAAAGGACCTGAAATTCGTACAGAACTTTTTG
AAGATGGTGCAGATTTCCATTCATATACAACAGGTACAAAATTACGTGTTGCTACTAAGC
AAGGTATCAAATCAACTCCAGAAGTGATTGCATTGAATGTTGCTGGTGGACTTGACATCT
TTGATGACGTTGAAGTTGGTAAGCAAATCCTTGTTGATGATGGTAAACTAGGTCTTACTG
TGTTTGCAAAAGATAAAGACACTCGTGAATTTGAAGTAGTTGTTGAGAATGATGGCCTTA
TTGGTAAACAAAAAGGTGTAAACATCCCTTATACTAAAATTCCTTTCCCAGCACTTGCAG
AACGCGATAATGCTGATATCCGTTTTGGACTTGAGCAAGGACTTAACTTTATTGCTATCT
CATTTGTACGTACTGCTAAAGATGTTAATGAAGTTCGTGCTATTTGTGAAGAAACTGGsm
ATGGACACGTTAAGTTGTTTGCTAAAATTGAAAATCAACAAGGTATCGATAATATTGATG
AGATTATCGAAGCAGCAGATGGTATTATGATTGCTCGTGGTGATATGGGTATCGAAGTTC
CATTTGAAATGGTTCCAGTTTACCAAAAAATGATCATTACTAAAGTTAATGCAGCTGGTA
AAGCAGTTATTACAGCAACAAATATGCTTgAAACAATGACTGATAAACCACGTGCGACTC
GTTCAGAAGTATCTGATGTCTTCAATGCTGTTATTGATGGTACTGATGCTACAATGCTTT
CAGGTGAGTCAGCTAATGGTAAATACCCAGTTGAGTCAGTTCGTACAATGGCTACTATTG
ATAAAAATGCTCAAACATTACTCAATGAGTATGGTCGCTTAGACTCATCTGCATTCCCAC
GTAATAACAAAACTGATGTTATTGCATCTGCGGTTAAAGATGCAACACACTCAATGGATA
TCAAACTTGTTGTAACAATTACTGAAACAGGTAATACAGCTCGTGCCATTTCTAAATTCC
GTCCAGATGCAGACATTTTGGCTGTTACATTTGATGAAAAAGTACAACGTTCATTGATGA
TTAACTGGGGTGTTATCCCTGTCCTTGCAGACAAACCAGCATCTACAGATGATATGTTTG
AGGTTGCAGAACGTGTAGCACTTGAAGCAGGATTTGTTGAATCAGGCGATAATATCGTTA
TCGTTGCAGGTGTTCCTGTAGGTACAGGTGGAACTAACACAATGCGTGTTCGTACTGTTA
AA

SEQ ID NO. 7202 

STRAIN 090

AATAAACGCGTAAAAATCGTTGCAACACT 

TGGTCCTGCGGTAGAATTCCGTGGTGGTAAGAAGTTTGGTGAGTCTGGAT 
ACTGGGGTGAAAGCCTTGACGTAGAAGCTTCAGCAGAAAAAATTGCTCAA 
T T G ATT AAAG AAG GT G C T AAC GT TTTCCGTTT C AAC T T CT C AC AT GG AG A 
TCATGCTGAGCAAGGAGCTCGTATGGCTACTGTTCGTAAAGCAGAAGAGA 
T T G C AG G AC AAAAAGT TGGCTTCCTCCTT GAT ACT AAAG G AC CT G AAAT T 
C GT AC AGAAC T T T T T G AAG AT GGT T C AG AT T T C CAT T CAT AT AC AAC AG G 
T AC AG AAT T ACGT GT T G C TACT AAG C AAG GT AT C AAAT C AAC T C C AG AAG 
TGATTGCATTGAATGTTGCTGGTGGACTTGACATCTTTGATGACGTTGAA 
GTTGGTAAGCAAATCCTTGTTGATGATGGTAAACTAGGTCTTACTGTGTT 
T GC AAAAG AT AAAG AC ACT C g T GAAT T T G AAGT AGT T GT T GAG AAT GAT G 
GCCTTATTGGTAAACAaaaaGGTGTAAACATCCCTTATACTAaAATTCCT 
TTCCCAgCACTTGCAGAACGCGATAATGCTGATATCCGTTTTGGACTTGA 
GCAAGGACTTAACTTTATTGCTATCTCATTTGTACGTACTGCTAAAGATG 
TTAATGAAGTTCGTGCTATTTGTGAAGAAACTGGCAATGGACATGTTAAG 
TT GTT T GCTAAAATT GAAAAT C AAC AAGGTAT CGAT AAT ATT GATGAG AT 
TAT C G AAG C AG C AG AT G GT AT TAT GAT TGCTCGTGGT GAT AT G G GT AT CG 
AAGTTCCATTTGAAATGGTTCCAGTTTACCAAAAAATGAT CAT TACT AAA 
GTTAATGCAGCTGGTAAAGCAGT TAT TACAGCAACAAATATGCTT GAAAC 
AATGACTGATAAACCACGTGCGACTCGTTCAGAAGTATCTGATGTCTTCA 
ATGCTGTTATTGATGGTACTGATGCTACAATGCTTTCAGGTGAGTCAGCT 
AATGGTAAATACCCAGTTGAGTCAGTTCGTACAATGGCTACTATTGATAA 
AAAT G CT C AAAC AT T AC T C AAT G AGT AT GGT C G C T TAG AC T CAT C T G C AT 
TCCCACGTAATAACAAAACTGATGTTATTGCATCTGCGGTTAAAGATGCA 
AC AC ACT C AAT G GAT AT C AAAC T T G T T GT G AC AAT T AC T GAAAC AG GT AA 
TACAGCTCGTGCCATTTCTAAATTCCGTCCAGATGCAGACATTTTGGCTG 
TT AC AT T T GAT G AAAAAGT AC AAC GTT CAT T GAT GAT T AACT GGGGTGTT 
ATCCCTGTCCTTGCAGACAAACCAGCATCTACAGATGATATGTTTGAGGT
TGCAGAACGTGTAGCACTTGAAGCAGGACTTGTTGAATCAGGCGATAATA
T C G T TAT CGT TG C AG GT GT T C CT GT AG G T AC AGGT GGAACT AAC AC AAT G 
CGTGTTCGTACTGTTAAA 

SEQ ID NO. 7203 

STRAIN A909 

AATAAACGCGTAAAAATCGTTGCAACACTTGGTC 

CTGCGGTTGAATTCCGTGGTGGTAAGAAGTTTGGTGAGTCTGGATACTGG 
G GT G AAAG C C T T G AC GT AG AAG CT T C AG C AG AAAAAAT T G C T C AAT T GAT 
T AAAG AAGGT GCT AACGTTTT C CGTT TCAACTTCTC AC AT GG AG AT C AT G 
C T GAG C AAGGAGCT CGT ATGGCTACTGT T CGTAAAGCAGAAGAGAT T GC A 
GGACAAAAAGTTGGCTTCCTCCTTGATACTAAAGGACCTGAAATTCGTAC 
AGAACTTTTTGAAGATGGTGCAGATTTCCATTCATATACAACAGGTACAA 
AAT TACGTGT TGCT ACT AAGCAAGGT AT C AAAT C AACT CC AGAAGT GAT T 
GCATTGAATGTTGCTGGTGGACTTGACATCTTTGATGACGTTGAAGTTGG 
TAAGCAAATCCTTGTTGATGATGGTAAACTAGGTCTTACTGTGTTTGCAA 
AAGAT AAAGACACT CGT GAAT T T GAAGT AGT TGTTGAGAAT GAT GGC CTT 
ATTGGTAAAC AAAAAGGT GT AAACAT CCCTT AT ACTAAAAT T CCT TT CC C 
AG C AC T T G C AG AACG CGAT AAT GCT GAT AT C C GT T T T GG AC T T GAG C AAG 
G AC T T AAC T T TAT T GCT AT C T CAT T T G T ACG T AC T G C T AAAgAT GT T AAT 
GAAGTTCGTGCTATTTGTGAAGAAACTGGCAATGGACACGTTAAGTTGTT 
T G CT AAAAT T GAAAAT C AAC AAG GT AT C GAT AAT AT T GAT GAG AT TAT C G 
AAGCAGCAGATGGTATTATGATTGCTCGTGGTGATATGGGTATCGAAGTT 
CC AT T T G AAAT GG T T C C AGT T T AC C AAAAAAT GAT CAT T AC T AAAGT T AA 
T G C AG C T G GT AAAG C AGT TAT T AC AG C AAGAAAT AT GCT T GAAAC AAT G A 
CTG AT AAACC ACGT GCG ACT CGTT C AGAAGT AT CTGATGTCTTC AAT GCT 
GTTATTGATGGTACTGATGCTACAATGCTTTCAGGTGAGTCAGCTAATGG
T AAAT AC C C AGT T G AGT C AGT T C GT AC AAT G G CT AC T AT T G AT AAAAAT G 
CTCAAACATTACTCAATGAGTATGGTCGCTTAGACTCATCTGCATTCCCA 
CGT AAT AACAAAACT GAT GT T ATTGCAT CTGCGGT T AAAG AT GCAAC AC A 
CT C AAT G GAT AT C AAACT T GT T GT AAC AAT TACT GAAAC AGGT AAT AC AG 
CTCGTGCCATTTCTAAATTCCGTCCAGATGCAGACATTTTGGCTGTTACA 
TTTGATGAAAAAGTACAACGTTCATTGATGATTAACTGGGGTGTTATCCC 
TGTCCTTG C AGAC AAAC C AG CAT C T AC AG AT G AT AT GT T T G AGGT T G C AG 
AACGTGTAGCACTTGAAGCAGGATTTGTTGAATCAGGCGATAATATCGTT 
AT CGT T G C AG GTGTTCCT GT AG G T AC AG GT G G AAC T AAC AC AAT G C G TG T 
TCGTACTGTTAAA 

SEQ ID NO. 7204 

STRAIN H36B 

AAT AAAC G C GT AAAAAT CGTT GCAAC 

ACTTGGTCCTGCGGTTGAATTCCGTGGTGGTAAGAAGTTTGGTGAGTCTG 
GAT ACT G GG GT G AAAG C C T T G AC GT AG AAG CTT C AG C AG AAAAAAT TGCT 
CAATTGATTAAAGAAGGTGCTAACGTTTTCCGTTTCAACTTCTCACATGG 
AG AT C ATG CT GAG C AAG G AGCT CG TAT GGC TAG T G T T C GT AAAG C AG AAG 
AG AT T GC AG GAC AAAAAGT T G G CT T C CT C CT T G AT ACT AAAGG AC CT G AA 
AT T CGT AC AG AAC T T T T T G AAG AT GGT G C AG AT T T CC AT T CAT AT AC AAC 
AGGT AC AAAAT TACGTGT TGCT ACT AAGCAAGGT AT C AAAT C AACT CC AG 
AAGTGATTGCATTGAATGTTGCTGGTGGACTTGACATCTTTGATGACGTT 
GAAGTTGGTAAGCAAATCCTTGTTGATGATGGTAAACTAGGTCTTACTGT 
GT T T GC AAAAG AT AAAG AC AC T C GT GAAT T T G AAG T AGTT GT T GAG AAT G 
AT GG C CT T AT T GGT AAAC AAAAAGGTGT AAAC AT C CCT TAT ACT AAAAT T 
CCTTTCCCAGCACTTGCAGAACGCGATAATGCTGATATCCGTTTTGGACT 
T G AGC AAGG ACTT AACTTT AT TGCT AT CT CAT TTGT ACGT ACTGCT AAAG 
AT GT T AAT G AAG T T C GT G CT AT T T G T G AAG AAAC T G G C AAT GG AC AC GT T 
AAGTTGTTTGCTAAAATTGAAAATCAACAAGGTATCGATAATATTGATGA 
GAT TAT C G AAG C AG C AG AT G GT AT T AT GAT TGCT C GT GGT G AT AT G G GT A 
T CG AAGT T C CAT T T G AAAT G GT T C C AG T T T AC C AAAAAAT GAT CAT TACT 
AAAGTTAATGCAGCTGGTAAAGCAGTTATTACAGCAACAAATATGCTTGA 
AAC AAT G Ac T GAT AAAC C AC GT G C G AC T CGT T C AG AAGT AT C T GAT GT C T 
T C AAT GCT G T TAT T GAT G GT ACT GAT G CT AC AAT G C T T T C AG G T G AGT C A 
GCTAATGGTAAATACCCAGTTGAGTCAGTTCGTACAATGGCTACTATTGA 
TAAAAATGCTCAAACATTACTCAATGAGTATGGTCGCTTAGACTCATCTG 
CATTCCCACGTAATAACAAAACTGATGTTATTGCATCTGCGGTTAAAGAT
GCAACACACTCAATGGATATCAAACTTGTTGTAACAATTACTGaAACAGG
T AAT AC AG C T CGT G C CAT T T C T AAAT T C C GT C C AG AT G C AG AC AT T T T GG 
C T GT T AC AT T T GAT G AAAAAG T AC AAC GT T CAT T GAT G AT T AACT GG G GT 
GT T AT CCCTGTCCT T GC AG AC AAAC C AG CAT CT AC AG AT GAT AT GT T T GA 
G GT T G C AG AACGT GT AG C AC T T GAAG C AG GAT T T G T T GAAT C AGGCG AT A 
ATATCGTTATCGTTGCAGGTGTTCCTGTAgGTACAGGTGGAACTAACACA 
ATGCGTGTTCGTACTGTTAAA 

SEQ ID NO. 7205 

STRAIN 18RS21 

AATAAACGCGTAAAAATCGTTGCAAC 

ACTTGGTCCTGCGGTTGAATTCCGTGGTGGTAAGAAGTTTGGTGAGTCTG 
GAT ACT GGGGT G AAAG C C T T G AC GT Ag AAG C T T C AG C AG AAAAAAT T G CT 
C AAT T GAT TAAAG AAGGT GC T AACGT TTTCCGTTT C AACT T CT C AC AT G G 
AG AT CAT G C T GAG C AAGG AG CT C GT AT GG C T AC T GT T C GT AAAG C AG AAG 
AG AT T G C AG G AC AAAAAGT TGGCTTCCTCCTT GAT AC T AAAGGAC CT G AA 
ATTCGTACAGAACTTTTTGAAGATGGTGCAGATTTCCATTCATATACAAC 
AG GT AC AAAAT TAG GT GT T G CT AC T AAG C AAG GT AT C AAAT C AAC T C C AG 
AAGTGATTGCATTGAATGTTGCTGGTGGACTTGACATCTTTGATGACGTT 
G AAG TT GGT AAGC AAAT C CT T GT T GAT GAT GGT AAACT AGGT C T TACT G T 
GT T T G C AAAAGAT AAAG AC ACT C GT G AAT T T G AAGT AG T T GT T GAG AAT G 
ATGGCCTTATTGGTAAACAAAAAGGTGTAAACATCCCTTATACTAAAATT 
C CTT T C C C AG C ACT T G C AG AAC GC GAT AAT G CT G AT AT C C GT T T T GG AC T 
TGAGCAAGGACTTAACTTTATTGCTATCTCATTtGTACGTACTGCTAAAG 
AT GT T AAT G AAGT T C GT G C T AT T T GT G AAGAAACT GG C AAT G G AC ACGT T 
AAGT T GTT TGCT AAAAT T GAAAAT C AAC AAGGT AT CG AT AAT AT T GAT G A 
GATTATCGAAGCAGCAGATGGTATTATGATTGCTCGTGGTGATATGGGTA 
T C G AAGT T C CAT T T G AAAT GGT T C C AGT T T AC C AAAAAAT GAT CAT TACT 
AAAGT T AAT G C AG C T GGT AAAG C AGT TAT T AC AG C AAC AAAT AT G C T T G A 
AACAATGaCTGATAAACCACGTGCGACTCGTTCAGAAGTATCTGATGTCT 
TCAATGCTGTTATTGATGGTACTGATGCTACAATGCTTTCAGGTGAGTCA 
G CT AAT G GT AAAT AC C C AGT T GAG T C AGT T CGT AC AAT GGC TAG TAT T GA 
T AAAAAT G C T C AAAC AT T AC T C AAT GAG TAT GGT C G CT T AG ACT CAT C T G 
CATTCCCACGTAATAACAAAACTGATGTTATTGCATCTGCGGTTAAAGAT 
G C AAC AC AC T C AATG GAT AT C AAACT T GT T GT AAC AAT TAG T G AAAC AG G 
TAATACAGCTCGTGCCATTTCTAAATTCCGTCCAGATGCAGACATTTTGG 
CT GT T AC AT T T GAT G AAAAAG T AC AAC GT T CAT T GAT GAT T AAC T G G G GT 
GT T AT CCCTGTCCTTG C AG AC AAAC C AG CAT C T AC AG AT GAT AT GT T T G A 
GGTTGCAGAACGTGTAgCACTTGAAGCAGGATTTGTTGAATCAGGCGATA 
AT AT C GT T AT CGT T GC AGGT G T T C CT GT AgG T AC AG GT G G AAC T AAC AC A 
ATGCGTGTTCGTACTGTTAAA 

SEQ ID NO. 7206 

STRAIN M732 

AATAAACGCGTAAAAATCGTTGCAAC 

ACTTGGTCCTGCGGTAGAATTCCGTGGTGGTAAGAAGTTTGGTGAGTCTG 
G AT ACT GGGGT GAAAGCCTTGACGT AG AAGCTTCAGC AG AAAAAAT TGCT 
CAATTGATTAAAGAAGGTGCT AACGT TTTCCGTTTCAACTTCTC AC AT GG 
AGAT CAT GCT G AG C AAG GAG C T CG T AT G G CT AC T GT T CG TAAAG C AG AAG 
AGATTGCAGGACAAAAAGTTGGCTTCCTCCTTGATACTAAAGGACCTGAA 
AT T C GT AC AG AAC T T T T T GAAG AT GGT G C AG AT T T C CAT T CAT AT AC AAC 
AGGTACAAAATTACGTGTTGCTACTAAGCAAGGTATCAAATCAACTCCAG 
AAGT GAT TGCATT GAAT GTT GCT GGT GGACTTGACATCTTT GAT GACGTT 
GAAGTTGGTAAGCAAATCCTTGTTGATGATGGTAAACTAGGTCTTACTGT 
GTTTG C AAAAGAT AAAG AC ACT CGTGAATTTGAAGT AGT T GTT GAG AATG 
ATGGCCTTATTGGTAAACAAAAAGGTGTAAACATCCCTTATACTAAAATT 
CCTTTCCCAGCACTTGCAGAACGCGATAATGCTGATATCCGTTTTGGACT 
TGAGCAAGGACTTAACTTTATTGCTATCTCATTTGTACGTACTGCTAAAG 
AT G T T AAT GAAG T T C GT G C T AT T T GT GAAG AAAC T GG C AAT GG AC ACGT T 
AAGT TGTTTGCT AAAAT T GAAAAT C AAC AAG GT AT C GAT AAT AT T GAT G A 
GATTATCGAAGCAGCAGATGGTATTATGATTGCTCGTGGTGATATGGGTA 
TC G AAGT T C CAT T T G AAAT G GT T C C AGTT T AC C AAAAAAT GAT CAT T AC T 
AAAGTTAATGCAGCTGGTAAAGCAGTTATTACAGCAACAAATATGCTTGA 
AACAATGACTGATAAACCACGTGCGACTCGTTCAGAAGTATCTGATGTCT
TCAATGCTGTTATTGATGGTACTGATGCTACAATGCTTTCAGGTGAGTCA
GCTAATGGTAAATACCCAGTTGAGTCAGTTCGTACAATGGCTACTATTGA 
T AAAAAT G CT C AAAC AT T AC T C AAT G AGT AT GGT C GCT T AGAC T GAT C T G 
CATTCCCACGTAATAACAAAACTGATGTTATTGCATCTGCGGTTAAAGAT 
GC AAC AC ACT C AAT GG AT AT C AAACT T GT T G T AAC AAT T AC T G AAAC AG G 
TAATACAGCTCGTGCCATTTCTAAATTCCGTCCAGATGCAGACATTTTGG 
CTGTTACATTTGATGAAAAAGTACAACGTTCATTGATGATTAACTGGGGT 
GT T AT CCCTGTCCTTG C AG AC AAAC C AGC AT CT AC AG AT GAT AT GT T T G A 
GGTTGCAGAACGTGTAgCACTTGAAGCAGGACTTGTTGAATCAGGCGATA 
AT AT C GT T AT C G T T G C AGGT G TT C C T GT AG GT AC AG GT GG AACT AAC AC A 
ATGCGTGTTCGTACTGTTAAA 

SEQ ID NO. 7207 

STRAIN COH1 

AAT AAACGCGT AAAAAT CGTTGC AAC 

ACTTGGTCCTGCGGTAGAATTCCGTGGTGGTAAGAAGTTTGGTGAGTCTG 
GATACTGGGGTGAAAGCCTTGACGTAGAAGCTTCAGCAGAAAAAATTGCT 
C AAT T G AT T AAAG AAGGT G CT AACGT T TTCCGTTT C AAC T T CT C AC AT G G 
AG AT CAT GCT G AG C AAGG AG C T C G T AT GGCT AC T GT T C GT AAAG C AG AAG 
AGATTGCAGGACAAAAAGTTGGCTTCCTCCTTGATACTAAAGGACCTGAA 
AT T C GT AC AG AAC t T T T T G AAG AT G GT G C AG AT T T C CAT T CAT AT AC AAC 
AG GT AC AAAAT T ACGT GT T G CT AC T AAG C AAG G TAT C AAAT C AACT C C AG 
AAGTGATTGCATTGAATGTTGCTGGTGGACTTGACATCTTTGATGACGTT 
GAAGTTGGTAAGCAAATCCTTGTTGATGATGGTAAACTAGGTCTTACTGT 
GT T T G C AAAAGAT AAAG AC AC T C GT G AAT T T GAAGT AG T T GT T GAG AAT G 
ATGGCCTTAtTGGTAAACAAAAAGGTGTAAACATCCCTTATACTAAAATT 
CCTTTCCCAGCACTTGCAGAACGCGATAATGCTGATATCCGTTTTGgACT 
T G AG C AAGG AC T T AAC T T TAT T G C TAT C T CAT T T GT AC GT ACT G C T AAAG 
ATGTTAATGAAGTTCGTGCTATTTGTGAAGAAACTGGCAATGGACACGTT 
AAGTTGTTTGCTAAAATTGAAAATCAACAAGGTATCGATAATATTGATGA 
GATTATCGAAGCAGCAGATGGTATTATGATTGCTCGTGGTGATATGGGTA 
T C GAAGT T C CAT T T GAAAT GGT T C C AGT T T AC C AAAAAAT GAT C AT T AC T 
AAAGT T AAT GC AG C T G GT AAAG C AGT TAT T AC AGC AAC AAAT AT G C T T G A 
AACAATGACTGATAAACCACGTGCGACTCGTTCAGaAGTATCTGATGTCT 
TCAATGCTGTTATTGATGGTACTGATGCTACAATGCTtTCAGGTGAGTCA 
GCTAATGGTAAATACCCAGTTGAGTCAGTTCGTACAATGGCTACTATTGA 
TAAAAATGCTCAAACATTACTCAATGAGTATGGTCGcTTAGACTCATCTG 
CAT T C C C AC GT AAT AAC AAAACT G AT GT TAT T G CAT C T GC GGT T AAAG AT 
G C AAC AC AC T C AAT GGAT AT C AAACT T G T T GT AAC AAT TACT G AAAC AG G 
T AAT AC AGC T C GT GC CAT T T C T AAAT T C CGT C C AG AT G C AG AC AT T T T GG 
CTGTTACATTTGATGAAAAAGTACAACGTTCATTGATGATTAACTGGGGT 
GT T AT CCCTGTCCTT GC AGAC AAAC C AGC AT C T AC AG AT GAT AT GT T T G A 
GGT T G C AGAAC GT GT AGC ACT T GAAG C AGG AC T T G T T G AAT C AG G C GAT A 
ATATCGTTATCGTTGCAGGTGTTCCTGTAGGTACAGGTGGAACTAACACA 
ATGCGTGTTCGTACTGTTAAA 

SEQ ID NO. 7208 

STRAIN M781

AATAAACGCGTAAAAATCGTTGCAAC 

ACTTGGTCCTGCGGTAGAATTCCGTGGTGGTAAGAAGTTTGGTGAGTCTG 
GAT AC T GG G GT G AAAG C C T T G AC GT AG AAG C T T C AGC AG AAAAAAT T G C T 
C AAT TG ATT AAAGAAGGT GCT AACGT TTTC CGT TTC AACT TCTC AC AT GG 
AGATCATGCTGAGCAAGGAGCTCGTATGGCTACTGTTCGTAAAGCAGAAG 
AG AT T G C AGG AC AAAAAGT T GGC T TCCTCCTT GAT AC T AAAG G AC C T G AA 
AT T C G T AC AG AAC T T T T T GAAG AT GGT G C AG AT TTC CAT T CAT AT AC AAC 
AG GT AC AAAAT TACGTGTT G CT ACT AAG C AAGGT AT C AAAT C AAC T C C AG 
AAGTGATTGCATTGAATGTTGCTGGTGGACTTGACATCTTTGATGACGTT 
GAAGTTGGTAAGCAAATCCTTGTTGATGATGGTAAACTAGGTCTTACTGT 
GT T T G C AAAAGAT AAAG AC AC T C G T G AAT T T GAAGT AGT T GT T GAG AAT G 
ATGGCCTTATTGgTAAACAAAAAGGTGTAAACATCCCTTATACTAAAATT 
CCTTTCCCAGCACTTGCAGaaCGCGATAATGCTGATATCCGTTTTGGACT 
T GAGCAAGGACT T AACTTT AT T GCTAT CT CATTTGT ACGT ACT GCT AAAG 
ATGTTAATGAAGTTCGTGCTATTTGTGAAGAAACTGGCAATGGACACGTT 
AAGTTGTTTGCTAAAATTGAAAATCAACAAGGTATCGATAATATTGATGA
GATTATCGAAGCAGCAGATGGTATTATGATTGCTCGTGGTGATATGGGTA
T C GAAGT T C CAT T T GAAAT GGT T C C AGT T TAG C AAAAAAT GAT CAT TACT 
AAAGTTAATGCAGCTGGTAAAGCAGTTATTACAGCAACAAATATGCTTGA 
AACAATGACTGATAAACCACGTGCGACTCGTTCAGAAGTATCTGATGTCT 
TCAATGCTGTTATTGATGGTACTGATGCTACAATGCTTTCAGGTGAGTCA 
GCTAATGGTAAATACCCAGTTGAGTCAGTTCGTACAATGGCTACTATTGA 
TAAAAATGCTCAAACATTACTCAATGAGTATGGTCGCTTAGACTCATCTG 
CATTCCCACGTAATAACAAAACTGATGTTATTGCATCTGCGGTTAAAGAT 
GCAACACACT C AATGGAT ATCAAACTTGT T GT AAC AAT T ACT GAAACAGG 
T AAT AC AG CT CGT G C CAT T T CT AAGT T C C GT C C AG AT G C AG AC AT T T T G G 
CTGTTACATTTGATGAAAAAGTACAACGTTCATTGATGATTAACTGGGGT 
GT T AT C C C T GT C C T T G C AG AC AAAC C AG CAT C T AC AGAT GAT AT G T T T GA 
GG T T G C AG AAC GT GT AGC ACT T G AAGC AGG AC T T G T T GAAT C AGG CG AT A 
ATATCGTTATCGTTGCAGGTGTTCCTGTAGGTACAGGTGGAACTAACACA 
ATGCGTGTTCGTACTGTTAAA 

SEQ ID NO. 7209 

STRAIN CJB110 

AATAAACGCGTAAAAATCGTTGCAAC 

ACTTGGTCCTGCGGTTGAATTCCGTGGTGGTAAGAAGTTTGGTGAGTCTG 
GAT AC T G G GGT G AAAG C CT T G AC GT Ag AAG CT T C AG C AGAAAAAAT T G C T 
CAATTGATTAAAGAAGGTGCTAACGTTTTCCGTTTCAACTTCTCACATGG 
AGAT CATGCTGAGCAAGGAGCT CGT ATGGCTACTGTTCGTAAAGCAGAAG 
AGAT T G C AG GAC AAAAAGT T GGCTTCCTCCTT GAT AC T AAAGG AC CT G AA 
AT T C GT AC AG AAC T T T T T G AAG AT G GT G C AG AT T T C CAT T CAT AT AC AAC 
AGG T AC AAAAT T AC G T GT T GC T ACT AAG C AAGGT AT C AAAT C AAC T CC AG 
AAGTGATTGCATTGAATGTTGCTGGTGGACTTGACATCTTTGATGACGTT 
GAAGTTGGTAAGCAAATCCTTGTTGATGATGGTAAACTAGGTCTTACTGT 
G T T T G C AAAAG AT AAAG AC AC T C GT GAAT T T GAAGT AG T T G T T GAG AAT G 
ATGGCCTTAtTGGTAAACAAAAAGGTGTAAACATCCCTTATACTAAAATT 
CCTTTCCCAGCACTTGCAGAACGCGATAATGCTGATATCCGTTTTGGACT 
T G AAC AAG GAC T T AACT T T AT T G C T AT CT CAT T T GT ACGT ACT G C T AAAG 
ATGTTAATGAAGTTCGTGCTATTTGTGAAGAAACTGGCAATGGACACGTT 
AAGT T GT T T G C T AAAAT T GAAAAT C AAC AAG GT AT C GAT AAT AT T GAT G A 
GATTATCGAAGCAGCAGATGGTATTATGATTGCTCGTGGTGATATGGGTA 
T CG AAGT TCCAT TT GAAATGGTT CC AGT TTACC AAAAAAT GAT C ATT ACT 
AAAGTTAATGCAGCTGGTAAAGCAGTTATTACAGCAACAAATATGCTTGA 
AACAAT GACT GAT AAACC ACGT GCGACT C GT TC AGAAGT AT CT GAT GT CT 
TCAATGCTGTTATTGATGGTACTGATGCTACAATGCTTTCAGGTGAGTCA 
GCTAATGGTAAATACCCAGTTGAGTCAGTTCGTACAATGGCTACTATTGA 
TAAAAATGCTCAAACATTACTCAATGAGTATGGTCGCTTAGACTCATCTG 
CATT C CCACGT AAT AAC AAAACTGAT GT T AT T GCAT CT GCGGTTAAAGAT 
G C AAC AC AC T C AAT G GAT AT C AAACT T GT T G T AAC AAT T AC T GAAACAGG 
T AAT AC AG C T C G T G C CAT T T CT AAAT T C C GT CC AG AT G C AG AC AT T T T GG 
C T G T T AC AT T TG AT G AAAAAGT AC AACGT T CAT T GAT G AT T AAC T G G G GT 
G T TAT CCCTGTCCTTG C AG AC AAAC C AG CAT C T AC AG AT GAT AT GT T T G A 
GGT T G C AG AAC GT GT AG C AC T T G AAG C AGG AT T T GT T GAAT C AG G C GAT A 
ATATCGtTATCGTTGCAGGTGTTCCTGTAGGTACAGGTGGAACTAACACA 
ATGCGTGTTCGTACTGTTAAA 

SEQ ID NO. 7210 

STRAIN 1169NT 

AATAAACGCGTAAAAATCGTTGCAAC 

ACTTGGTCCTGCGGTAGAATTCCGTGGTGGTAAGAAGTTTGGTGAGTCTG 
GAT ACT GGGGTGAAAGCCTTGACGT AG AAGCTTCAGC AGAAAAAAT TGCT 
C AAT T GAT T AAAG AAG GT G CT AAC GT T TT C CG T T T C AAC T T CT C AC AT GG 
AGAT CATGCTGAGCAAGGAGCT CGT ATGGCTACTGTTCGTAAAGCAGAAG 
AGAT TGC AGG AC AAAAAGT TGGCTTCCTCCTT GAT ACT AAAGG ACCTGAA 
AT T C G T AC AG AAC T T T T T G AAG AT G GT G C AG AT T T C C AT T CAT AT AC AAC 
AGGTACAAAATTACGTGTTGCTACTAAGCAAGGTATCAAATCAACTCCAG 
AAGTGATTGCATTGAATGTTGCTGGTGGACTTGACATCTTTGATGACGTT 
GAAGTTGGTAAGCAAATCCTTGTTGATGATGGTAAACTAGGTCTTACTGT 
G T T T G C AAAAG AT AAAG AC ACT CGT GAAT T T GAAGT AG T T G T T GAG AAT G 
ATGGCCTTATTGGTAAACAAAAAGGTGTAAACATCCCTTATACTAAAATT
CCTTTCCCAGCACTTGCAGAACGCGATAATGCTGATATCCGTTTTGGACT
TGAGCAAGGACTTAACTTTATTGCTATCTCATTTGTACGTACTGCTAAAG 
AT GT T AAT G AAGT T C GT G C TAT T T GT G AAGAAAC T GG C AAT GG AC ACGT T 
AAG T T GT T T G c T AAAAT T G AAAAT C Aa C AAGG T AT CG AT AAT AT T GAT GA 
GATTATCGAAGCAGCAGATGGTATTATGATTGCTCGTGGTGATATGGGTA 
T C G AAGT T C CAT T T GAAAT G GT T C C AGT T T AC C AAAa AAT GAT CAT TACT 
Aa AGT T AAT G C AGCT GGT AAAG C AG T TAT TAG AG C AAC AAAT AT G C T T G A 
AAC AAT GAC T GAT AAAC C ACGT G C G ACT C GT T C AG AAGT AT CT G AT GT C T 
T C AAT G C T GT TAT T GAT GGT AC T GAT G C T AC AAT G C T T T C AGGT G AGT C A 
G C T AAT GGT AAAT AC C C AGT T G AGT C AGT T CG T AC AAT GG C TACT AT T G A 
TAAAAATGCTCAAACAttACTCAATGAGTATGGTCGTTTAGACTCATCTG 
CAT T C C C AC G T AAT AAC AAAAC T GAT GT TAT T G CAT C T G C GGT T AAAG AT 
G C AAC AC AC T C AAT G G AT AT C AAAC TT G T T G T AAC AAT TACT G AAAC AGG 
T AAT AC AG CT C GT GC C AT T T C T AAAT T C C GT C C AG AT G C AGAC AT T T T G G 
CTGTTACATTTGATGAAAAAGTACAACGTTCATTGATGATTAACTGGGGT 
G T T AT C C C T GT C CT T G C AG AC AAAC C AG CAT C T AC AG AT GAT AT G T T T G A 
G GT T G C AGAAC GT GT AG C ACT T GAAG C AGG ACT T GT T G AAT C AGG C GAT A 
AT AT CGT TAT C GT T G C AG GT G T T C C T G T AGGT AC AG GT GG AAC T AAC AC A 
ATGCGTGTTCGTACTGTTAAA 

SEQ ID NO. 7211 

STRAIN JM9130013 

AAT AAACGCGT AAAAAT C GT T G C AAC 

ACTTGGTCCTGCGGTAGAATTCCGTGGTGGTAAGAAGTTTGGTGAGTCTG 
GATACTGGGGTGAAAGCCTTGACGTAGAAGCTTCAGCAGAAAAAATTGCT 
C AAT T G AT T AAAG AAGGT G CT AACG TTTTCCGTTT C AAC T T CT C AC AT GG 
AG AT CAT GC T GAG C AAGG AG CT C GT AT GGCTACTGTTC G T AAAG C AG AAG 
AGATTGCAGGACAAAAAGTTGGCTTCCTCCTTGATACTAAAGGACCTGAA
AT T C GT AC AG AACT T T T T G AAGAT G GT T C AG AT T T C CAT T CAT AT AC AAC 
AG GT AC AAAAT T AC GT GT T G C TACT AAG C AAG GT AT C AAAT C AACT C C AG 
AAGTGATTGCATTGAATGTTGCTGGTGGACTTGACATCTTTGATGACGTT 
GAAGTTGGTAAGCAAATCCTTGTTGATGATGGTAAACTAGGTCTTACTGT 
GTT T GCAAAAGAT AAAGACACTCGT GAATTT GAAGT AGTT GTTGAGAATG 
ATGGCCTTATTGGTAAACAAAAAGGTGTAAACATCCCTTATACTAAAATT 
CCTTTCCCAGCACTTGCAGAACGCGATAATGCTGATATCCGTTTTGGACT 
T GAG CAAGG ACTT AACT TTATTGCT AT CTC AT TTGT ACGT ACTGCT AAAG 
AT GT T AAT GAAGT T CGT GCT AT T T GT GAAG AAAC T G G C AAT GG AC AT GTT 
AAGT T GT T T G CT AAAAT T G a AAAT C Aa C AAG G TAT C GAT AAT AT T GAT G A 
GATTATCGAAGCAGCAGATGGTATTATGATTGCTCGTGGTGATATGGGTA 
T C GAAGT T C CAT T T GAAAT GGT T C C AG T T T AC C AAAAAAT GAT C AT T AC T 
AAAGTTAATGCAGCTGGTAAAGCAGTTAttACAGCAACAAATATGCTTGA 
AAC AAT GAC T GAT AAAC C ACGT G CG AC T CGT T C AGAAGT AT CT G AT GT CT 
TCAATGCTGTTATTGATGGTACTGATGCTACAATGCTTTCAGGTGAGTCA 
GCTAATGGTAAATACCCAGTTGAGTCAGTTCGTACAATGGCTACTATTGA 
TAAAAATGCTCAAACATTACTCAATGAGTATGGTCGCTTAGACTCATCTG 
CATTCCCACGTAATAaCAAAACTGATGTTATTGCATCTGCGGTTAAAGAT 
GCAACACACTCAATGGATATCAAACTTGTTGTGACAATTACTGAAACAGG 
TAATACAGCTCGTGCCATTTCTAAATTCCGTCCAGATGCAGACATTTTGG 
C T GT T AC AT T T GAT G AAAAAG T AC AAC GT T CAT T GAT GAT T AAC T G G G GT 
GT T AT CCCTGTCCTTG C AG AC AAAC C AG CAT C T AC AG AT G AT AT GT T T GA 
GGTTGCAGAACGTGTAgcACTTGAAGCAGGACTTGTTGAATCAGGCGATA 
ATATCGTTATCGTTGCAGGTGTTCCTGTAGGTACAGGTGGAACTAACACA 
ATGCGTGTTCGTACTGTTAAA 

SEQ ID NO. 7212 
STRAIN 2603 frame: 1 

MNKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHG 
DHAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGADFHSYTTGTKLRVATKQ 
GIKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEVVVENDGLI 
GKQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGX 
GHVKLFAKIENQQGIDNIDEIIEAADGXMIARGDMGIEVPFEMVPVYQKMIITKVNAAGK 
AVITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATID 
KNAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLVVTITETGNTARAISKFR 
PDADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGFVESGDNIVI
VAGVPVGTGGTNTMRVRTVK

SEQ ID NO. 7213 

STRAIN 090 frame: 1

NKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHGD
HAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGSDFHSYTTGTELRVATKQG
IKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEVVVENDGLIG
KQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGNG
HVKLFAKIENQQGIDNIDEIIEAADGIMIARGDMGIEVPFEMVPVYQKMIITKVNAAGKA
VITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATIDK
NAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLVVTITETGNTARAISKFRP
DADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGLVESGDNIVIV 
AGVPVGTGGTNTMRVRTVK 

SEQ ID NO. 7214 

STRAIN A909 frame: 1 

NKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHGD
HAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGADFHSYTTGTKLRVATKQG
IKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEVVVENDGLIG
KQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGNG
HVKLFAKIENQQGIDNIDEIIEAADGIMIARGDMGIEVPFEMVPVYQKMIITKVNAAGKA
VITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATIDK
NAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLVVTITETGNTARAISKFRP 
DADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGFVESGDNIVIV 
AGVPVGTGGTNTMRVRTVK 

SEQ ID NO. 7215 

STRAIN H36B frame: 1

NKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHGD
HAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGADFHSYTTGTKLRVATKQG
IKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEVVVENDGLIG
KQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGNG
HVKLFAKIENQQGIDNIDEIIEAADGIMIARGDMGIEVPFEMVPVYQKMIITKVNAAGKA
VITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATIDK
NAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLVVTITETGNTARAISKFRP
DADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGFVESGDNIVIV
AGVPVGTGGTNTMRVRTVK 

SEQ ID NO. 7216 

STRAIN 18RS21 frame: 1 

NKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHGD
HAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGADFHSYTTGTKLRVATKQG
IKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEVVVENDGLIG
KQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGNG 
HVKLFAKIENQQGIDNIDEIIEAADGIMIARGDMGIEVPFEMVPVYQKMIITKVNAAGKA 
VITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATIDK 
NAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLVVTITETGNTARAISKFRP 
DADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGFVESGDNIVIV 
AGVPVGTGGTNTMRVRTVK 

SEQ ID NO. 7217 

STRAIN M732 frame: 1 

NKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHGD
HAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGADFHSYTTGTKLRVATKQG
IKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEVVVENDGLIG
KQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGNG
HVKLFAKIENQQGIDNIDEIIEAADGIMIARGDMGIEVPFEMVPVYQKMIITKVNAAGKA
VITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATIDK
NAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLVVTITETGNTARAISKFRP
DADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGLVESGDNIVIV
AGVPVGTGGTNTMRVRTVK 

SEQ ID NO. 7218 

STRAIN COH1 frame: 1

NKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHGD
HAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGADFHSYTTGTKLRVATKQG
IKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEVVVENDGLIG
KQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGNG
HVKLFAKIENQQGIDNIDEIIEAADGIMIARGDMGIEVPFEMVPVYQKMIITKVNAAGKA
VITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATIDK
NAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLVVTITETGNTARAISKFRP
DADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGLVESGDNIVIV
AGVPVGTGGTNTMRVRTVK

SEQ ID NO. 7219 

STRAIN M781 frame: 1

NKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHGD
HAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGADFHSYTTGTKLRVATKQG
IKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEVVVENDGLIG
KQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGNG
HVKLFAKIENQQGIDNIDEIIEAADGIMIARGDMGIEVPFEMVPVYQKMIITKVNAAGKA
VITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATIDK
NAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLVVTITETGNTARAISKFRP
DADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGLVESGDNIVIV
AGVPVGTGGTNTMRVRTVK 

SEQ ID NO. 7220 

STRAIN CJB110 frame: 1 

NKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHGD
HAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGADFHSYTTGTKLRVATKQG
IKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEVVVENDGLIG
KQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGNG
HVKLFAKIENQQGIDNIDEIIEAADGIMIARGDMGIEVPFEMVPVYQKMIITKVNAAGKA
VITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATIDK
NAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLVVTITETGNTARAISKFRP
DADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGFVESGDNIVIV
AGVPVGTGGTNTMRVRTVK 

SEQ ID NO. 7221 

STRAIN 1169NT frame: 1 

NKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHGD 
HAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGADFHSYTTGTKLRVATKQG 
IKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEVVVENDGLIG
KQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGNG
HVKLFAKIENQQGIDNIDEIIEAADGIMIARGDMGIEVPFEMVPVYQKMIITKVNAAGKA
VITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATIDK
NAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLVVTITETGNTARAISKFRP
DADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGLVESGDNIVIV
AGVPVGTGGTNTMRVRTVK 

SEQ ID NO. 7222 

STRAIN JM9130013 frame: 1 

NKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHGD 
HAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGSDFHSYTTGTKLRVATKQG 
IKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEVVVENDGLIG
KQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGNG
HVKLFAKIENQQGIDNIDEIIEAADGIMIARGDMGIEVPFEMVPVYQKMIITKVNAAGKA
VITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATIDK
NAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLVVTITETGNTARAISKFRP
DADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGLVESGDNIVIV
AGVPVGTGGTNTMRVRTVK 

SEQ ID NO. 7301 
STRAIN 2603 

TTGTCTGCTATAATAGACAAAAAGGTGGTGATATTTATGTATTTAGCATTAATCGGTGAT 
ATCATTAATTCAAAACAGATACTTGAACGTGAAACTTTCCAACAGTCTTTTCAGCAACTA
ATGACCGAACTATCTGATGTATATGGTGAAGAGCTGATTTCTCCATTCACTATTACAGCT
GGTGATGAATTTCAAGCTTTATTGAAACCATCAAAAAAGGTATTTCAAATTATTGACCAT
ATTCAACTAGCTCTAAAACCTGTTAATGTAAGGTTCGGCCTCGGTACAGGAAACATTATA
ACATCCATCAATTCAAATGAAAGTATCGGTGCTGATGGTCCTGCCTACTGGCATGCTCGC
TCAGCTATTAATCATATACATGATAAAAATGATTATGGAACAGTTCAAGTAGCTATTTGC
CTTGATGATGAAGACCAAAACCTTGAATTAACACTAAATAGTCTCATTTCAGCTGGTGAT
TTTATCAAGTCAAAATGGACTACAAACCATTTTCAAATGCTTGAGCACTTAATACTTCAA
GATAATTATCAAGAACAATTTCAACATCAAAAGTTAGCCCAACTGGAAAATATTGAACCT
AGTGCGCTGACTAAACGCCTTAAAGCAAGCGGTCTGAAGATTTACTTAAGAACGAGAACA
CAGGCAGCCGATCTATTAGTTAAAAGTTGCACTCAAACTAAAGGGGGAAGCTATGATTTC 

SEQ ID NO. 7302 

STRAIN 090 

TCTGCTATAATAGACAAAAAGGTGGTGATATTTATGTATTT 
AGCATTAATCGGTGATATCATTAATTCAAAACAGATACTTGAACGTGAAA
CTTTCCAACAGTCTTTTCAGCAAcTAATGACCGAACTATcTGATGTATAT
GGTGAAGAGCTGATTTCTCCATTCACTATTACAGCTGGTGATGAATTTCA
AGCTTTATTGAAACCATCAAAAAAGGTATTTCAAATTATTGACCATATTC
AACTAGCTCTAAAACCTGTTAATGTAAGGTTCGGCCTCGGtACAGGAAAC
ATTATAACATCCATCAATTTAAATGAAAGTATCGGTGCTGATGGTCCTGC
CTACTGGCATGCTCGCTCAGCTATTAATCATATACATGATAAAAATGATT
ATGGAACAGTTCAAGTAGCTATTTGCCTTGATGATGAAGACCAAAACCTT
GAATTAACACTAAATAGTCTCATTTCAGCTGGTGATTTTATCAAGTCAAA
ATGGACTACAAACCATTTTCAAATGCTTGAGCACTTAATACTTCAAGATA
ATTATCAAGAACAATTTCAACATCAAAAGTTAGCCCAACTGGAAAATATT
GAACCTAGTGCGCTGACTAAACGCCTTAAAGCAAGCGGTCTGAAGATTTA
CTTAAGAACGAGAACACAGGCAGCCGATCTATTAGTTAAAAGTTGCACTC
AAACTAAAGGGGGAAGCTATGATTTC 

SEQ ID NO. 7303 

STRAIN A909 

T CT G C T AT AAT AG AC AAAAAG GT GGT GAT AT T T AT G T AT 
TT AGCATT AAT CGGT GAT AT CAT T AAT T CAAAAC AGAT ACT T GAAC GT GA 
AACTTTCCAACAGTCTTTTCAGCAACTAATGACCGAACTATCTGATGTAT 
AT GGT G AAG AG CT G AT T T C T C CAT T CAC TAT T AC AG C T G GT GAT GAAT T T 
CAAGCTTTATTGAAACCATCAAAAAAGGTATTTCAAATTATTGACCATAT 
TCAACTAGCTCTAAAACCTGTTAATGTAAGGTTCGGCCTCGGTACAGGAA 
ACATT AT AACAT CCAT C AATT CAAAT GAAAGTATCGGTGCT GAT GGT CCT 
G C CT AC T G G C AT GCT CG C T C AG C TAT T AAT CAT AT AC AT GAT AAAAAT G A 
TTATGGAAC AGT T CAAGT AGCT AT TT GCCTT GAT GAT GAAGAC CAAAACC 
T T GAAT T AAC AC T AAAT AGT CT C AT TT C AG C T G GT GAT T T T AT C AAG T C A 
AAAT GG AC T AC AAAC CAT T T T CAAAT G CT T GAG C AC T T AAT AC T T C AAG A 
TAATTAT C AAGAAC AAT TT CAACAT CAAAAGTT AGCCCAACT GGAAAAT A 
TTG AACCT AGT GCGCTGACT AAACGC CTTAAAGC AAG CGGT CTG AAG ATT 
T ACT T AAGAAC GAGAAC AC AG GC AGC C GAT C T AT TAG T T AAAAGT T G CAC 
TCAAACTAAAGGGGGAAGCTATGATTTC 

SEQ ID NO. 7304 

STRAIN H36B

TCTGCTATAATAGACAAAAAGGTGGTGATATTT 

ATGTATTTAGCATTAATCGGTGATATCATTAATTCAAAACAGATACTTGA 
ACGTGAAACTTTCCAACAGTCTTTTCAGCAACTAATGACCGAACTATCTG 
AT GT AT AT GGT G AAG AG CT GAT T T C T C CAT T C ACT AT T AC AG C T G GT GAT 
GAATTT CAAGCT TT ATT GAAACCAT C AAAAAAGGTAT TT C AAATT AT T GA 
CCATATTCAACTAGCTCTAAAACCTGTTAATGTAAGGTTCGGCCTCGGTA 
C AG GAAAC AT TAT AAC AT C CAT C AAT T CAAAT G AAAG T AT C G GT G CT GAT 
GGTCCTGCCTACTGGCATGCTCGCTCAGCTATTAATCATATACATGATAA 
AAATGATTATGGAACAGTTCAAGTAGCTATTTGCCTTGATGATGAAGACC
AAAAC CT T G AAT T AAC AC T AAAT AGT C T CAT T T C AG C T GGT GAT T T T AT C 
AAGTCAAAATGGACTACAAACCATTTTCAAATGCTTGAGCACTTAATACT 
T C AAGAT AAT TAT C AAG AAC AAT T T CAACAT CAAAAGTT AGCC C AACT GG 
AAAAT AT T G AAC CT AGT GC G CT G ACT AAAC G C CT T AAAG C AAG C G GT C T G 
AAGAT T T AC T T AAG AAC G AG AAC AC AGG C AG C C GAT C T AT T AG T T AAAAG 
TTGCACTCAAACTAAAGGGGGAAGCTATGATTTC 

SEQ ID NO. 7305 
STRAIN 18RS21

TCTGCTATAATAGACAAAAAGGTGGTGATATTT 

ATGTATTTAGCATTAATCGGTGATATCATTAATTCAAAACAGATACTTGA 
ACGTGAAACTTTCCAACAGTCTTTTCAGCAACTAATGACCGAACTATCTG 
AT GT AT AT G GT GAAGAG C T GAT T T CT C CAT T C AC TAT T AC AG CT GGT GAT 
GAATTTCAAGCTTTATTGAAACCATCAAAAAAGGTATTTCAAATTATTGA 
CCATATTCAACTAGCTCTAAAACCTGTTAATGTAAGGTTCGGCCTCGGTA 
CAGGAAACATTATAACATCCATCAATTCAAATGAAAGTATCGGTGCTGAT 
GGTCCTGCCTACTGGCATGCTCGCTCAGCTATTAATCATATACATGATAA 
AAAT GAT TAT GGAAC AGT T C AAGT AG C T AT T T GC C T T GAT GAT G AAG AC C 
AAAACCTTGAATTAACACTAAATAGTCTCATTTCAGCTGGTGATTTTATC 
AAG T C AAAAT GGAC T AC AAAC CAT T T T C AAAT GC T T GAG C AC T T AAT AC T 
T C AAG AT AAT T AT C AAG AAC AAT T T C AAC AT C AAAAGT T AGC C C AACT GG 
AAAATATTGAACCTAGTGCGCTGACTAAACGCCTTAAAGCAAGCGGTCTG 
AAG ATT TACT T AAG AAC GAG AAC AC AG G C AG C CG AT C T AT T AGT T AAAAG 
TTGCACTCAAACTAAAGGGGGAAGCTATGATTTC 

SEQ ID NO. 7306 

STRAIN M732 

T CT GCT AT AAT AGAC AAAAAGGT GGTG AT AT T 

TATGTATTTAGCATTAATCGGTGATATCATTAATTCAAAACAGATACTTG 
AAC GT G AAACT T T C C AAC AGT C T T T T C AG C AACT AAT G AC C G AAC T AT CT 
GATGTATATGGTGAAGAGCTGATTTCTCCATTCACTATTACAGCTGGTGA 
TGAATTTCAAGCTTTATTGAAACaATCAAAAAAGGTATTTCAAATTATTG 
ACCATATTCAACTAGCTCTAAAACCTGTTAATGTAAGGTTCGGCCTCGGT 
AC AGG AAAC AT TAT AAC AT C CAT C AAT T C AAAT G AAAGT AT C G GT G C T G A 
TGGTCCTGC C T ACT G G CAT G C T C G C T C AG C TAT T AAT CAT AT AC AT GAT A 
AAAAT GAT TAT G G AAC AG T T C AAG T AG CT AT T T GC C T T GAT GAT G AAG AC 
CAAAACCTTGAATTAACACTAAATAGTCTCATTTCAGCTGGTGATTTTAT 
C AAGT C AAAAT GGACT AC AAACCATTTTC AAAT GCT TG AGC ACT T AAT AC 
T T C AAG AT AAT TAT C AAGAACAATTT CAACAT C AAAAGT T AGCC CAACTG 
G AAAAT AT T G AAC CT AGT GCGC T G ACT AAAC G C C T T AAAG C AAG C G G T C T 
G AAGAT T T ACTT AAGAACGAGAACACAGGCAGCCG AT CT AT T AGTT AAAA 
GTTGCACTC AAACT AAAGGGGGAAGCTATG AT TTC 

SEQ ID NO. 7307 

STRAIN COH1 

T C T GCT AT AAT AGAC AAAAAGGT GGT GAT AT T 

T AT GT AT T T AGC AT T AAT CGGT GAT AT CAT T AAT T C AAAA C AG AT AC T T G 
AAC G T G AAAC T T T C C AAC AGT C T T T T C AG C AACT AAT G AC C G AACT AT C T 
GATGTATATGGTGAAGAGCTGATTTCTCCATTCACTATTACAGCTGGTGA 
T G AAT T T C AAG CT T T AT T G AAAC a AT C AAAAAAG GT AT T T C AAAT TAT T G 
ACCATATTCAACTAGCTCTAAAACCTGTTAATGTAAGGTTCGGCCTCGGT 
AC AG G AAAC AT TAT AAC AT C CAT C AAT T C AAAT G AAAGT AT CGGT G CT GA 
TGGTCCTGCCTACTGGCATGCTCGCTCAGCTATTAATCATATACATGATA 
AAAAT GAT T AT G G AAC AG T T C AAGT AG C T AT T T G C CT T GAT GAT GAAG AC 
CAAAACCTTGAATTAACACTAAATAGTCTCATTTCAGCTGGTGATTTTAT 
C AAGT C AAAAT GG AC T AC AAAC CAT T T T C AAAT G C T T GAG C ACT T AAT AC 
T T C AAG AT AAT TAT C AAG AAC AAT T T CAACAT C AAAAG T T AG C C C AAC T G 
G AAAAT AT T G AAC C T AGT G C G C T G AC T AAAC G C CT T AAAG C AAG C GG T C T 
GAAG AT T T AC T T AAG AAC GAG AAC AC AGG C AG C C GAT C T AT T AGT T AAAA 
GTTGCACTCAAACTAAAGGGGGAAGCTATGATTTC 

SEQ ID NO. 7308 

STRAIN M781

TCTGCTATAATAGACAAAAAGGTGGTGATATTT 

ATGTATTTAGCATTAATCGGTGATATCATTAATTCAAAACAGATACTTGA 
AC GT G AAAC T T T C C AAC AG T C T T T T C AG C AAC T AAT G AC C G AAC TAT CT G 
AT G TAT AT GGT GAAGAG C T GAT T T C T C C AT T C ACT ATT AC AG CT GGT GAT 
GAATTTCAAGCTTTATTGAAACAATCAAAAAAGGTATTTCAAATTATTGA 
CCATATTCAACTAGCTCTAAAACCTGTTAATGTAAGGTTCGGCCTCGGTA 
C AGG AAAC AT TAT AAC AT C CAT C AAT T C AAAT G AAAGT AT CGGT G CT G AT 
GGTCCTGCCTACTGGCATGCTCGCTCAGCTATTAATCATATACATGATAA 
AAAT GAT TAT GGAAC AGT T C AAG T AG C T AT T T G C C T T GAT GAT GAAG AC C 
AAAACCTTGAATTAACACTAAATAGTCTCATTTCAGCTGGTGATTTTATC
AAG T C AAAAT G GAC T AC AAAC CAT T T T C AAAT G C T T G AGC ACT T AAT AC T 
T CAAGAT AAT TAT CAAGAAC AAT T T CAACAT CAAAAGTTAGC CCAACT GG 
AAAATATTGAACCTAGTGCGCTGACTAAACGCCTTAAAGCAAGCGGTCTG 
AAG AT T T AC T T AAG AAC G AGAAC AC AGG C AG C CGAT CT AT T AGT T AAAAG 
T T G C ACT C AAACT AAAG GG G G AAG C T AT GAT T T C 

SEQ ID NO. 7309 

STRAIN CJB110 

T CT GCT AT AATAGAC AAAAAGGT GGT GGTA 

TTTATGTATTTAGCATTAATCGGTGATATCATTAATTCAAAACAGATACT 
T G AAC GT G AAACT T T C C AAC AGT CT T T T C AG C AAC T AAT GAC C G AAC TAT 
CTGATGTATATGGTGAAGAGCTGATTTCTCTATTCACTATTACAGCTGGT 
GAT GAATTT C AAGCTT T AT TGAAACC AT CAAAAAAGGT ATTT CAAATT AT 
TGACCATATTCAACTAGCTCTAAAACCTGTTAATGTAAGGTTCGGCCTCG 
GT AC AGGAAACAT T AT AACAT C CAT CAATT CAAAT GAAAGT AT CGGTGC T 
GAT GGTCCTGC C T AC T GG C AT G C T C G C T C AGC TAT T AAT CAT AT AC AT G A 
T AAAAAT GAT TAT GG AAC AG T T C AAG T AG CT AT T T G C C T T GAT GAT GAAG 
ACCAAAACCTTGAATTAACACTAAATAGTCTCATTTCAGCTGGTGATTTT 
AT C AAGT C AAAAT G GAC TACT AAC CAT T T T CAAAT GCT TG AG C ACT T AAT 
ACT T CAAGAT AAT TAT CAAGAAC AAT T T CAACAT C AAAAGT T AGC C C AAC 
TGGAAAATATTGAACCTAGTGCGCTGACTAAACGCCTTAAAGCAAGCGGT 
CTGAAGATTTACTTAAGAACGAGAACACAGGCAGCCGATCTATTAGTTAA 
AAGT TGCACTC AAACT AAAGGGGGAAGCTATG ATTT c 

SEQ ID NO. 7310 

STRAIN JM9130013 

T C T GC TAT AAT AG AC AAAAAGGT GGT GAT AT T T 

ATGTATTTAGCATTAATCGGTGATATCATTAATTCAAAACAGATACTTGA 
AC GT G AAAC T T T C C AAC AGT C T T T T C AG C AAC T AAT GAC C G AAC TAT C T G 
ATGTATATGGTGAAGAGCTGATTTCTCCATTCACTATTACAGCTGGTGAT 
G AAT T T C AAG CT T T AT T GAAAC CAT C AAAAAAG G T AT T T CAAAT TAT T G A 
CCATATTCAACTAGCTCTAAAACCTGTTAATGTAAGGTTCGGCCTCGGTA 
C AG GAAAC AT TAT AAC AT CC AT C AAT T CAAAT GAAAGT AT C G G T G CT G AT 
GGTCCTGC C T AC T G G CAT G C T CG CT C AG C T AT T AAT CAT AT AC AT G AT AA 
AAAT GAT T AT GG AAC AGT T C AAG TAG C T AT T T G C C T T GAT GAT GAAG AC C 
AAAACC T T G AAT T AAC ACT AAAT AGT C T CAT T T C AG CT G G T GAT T T TAT C 
AAGT CAAAATGGACT ACAAAC CATTTT CAAAT GCT TGAGCACTT AAT ACT 
T CAAGAT AAT TATCAAGAACAAT TT CAACAT CAAAAGTTAGCCC AACT GG 
AAAATATTGAACCTAGTGCGCTGACTAAACGCCTTAAAGCAAGCGGTCTG 
AAG AT T T AC T T AAG AAC GAG AAC AC AG G C AG C C GAT C T AT T AGT T AAAAG 
TTGCACTCAAACTAAAGGGGGAAGCTATGATTTC 

SEQ ID NO. 7311 
STRAIN 2603 frame: 1 

LSAIIDKKVVIFMYLALIGDIINSKQILERETFQQSFQQLMTELSDVYGEELISPFTITA 
GDEFQALLKPSKKVFQIIDHIQLALKPVNVRFGLGTGNIITSINSNESIGADGPAYWHAR 
SAINHIHDKNDYGTVQVAICLDDEDQNLELTLNSLISAGDFIKSKWTTNHFQMLEHLILQ
DNYQEQFQHQKLAQLENIEPSALTKRLKASGLKIYLRTRTQAADLLVKSCTQTKGGSYDF 

SEQ ID NO. 7312 

STRAIN 090 frame: 1 

SAIIDKKVVIFMYLALIGDIINSKQILERETFQQSFQQLMTELSDVYGEELISPFTITAG
DEFQALLKPSKKVFQIIDHIQLALKPVNVRFGLGTGNIITSINLNESIGADGPAYWHARS 
AINHIHDKNDYGTVQVAICLDDEDQNLELTLNSLISAGDFIKSKWTTNHFQMLEHLILQD 
NYQEQFQHQKLAQLENIEPSALTKRLKASGLKIYLRTRTQAADLLVKSCTQTKGGSYDF 

SEQ ID NO. 7313 

STRAIN A909 frame: 1

SAIIDKKVVIFMYLALIGDIINSKQILERETFQQSFQQLMTELSDVYGEELISPFTITAG
DEFQALLKPSKKVFQIIDHIQLALKPVNVRFGLGTGNIITSINSNESIGADGPAYWHARS
AINHIHDKNDYGTVQVAICLDDEDQNLELTLNSLISAGDFIKSKWTTNHFQMLEHLILQD
NYQEQFQHQKLAQLENIEPSALTKRLKASGLKIYLRTRTQAADLLVKSCTQTKGGSYDF 

SEQ ID NO. 7314

STRAIN H36B frame: 1 

SAIIDKKVVIFMYLALIGDIINSKQILERETFQQSFQQLMTELSDVYGEELISPFTITAG
DEFQALLKPSKKVFQIIDHIQLALKPVNVRFGLGTGNIITSINSNESIGADGPAYWHARS 
AINHIHDKNDYGTVQVAICLDDEDQNLELTLNSLISAGDFIKSKWTTNHFQMLEHLILQD 
NYQEQFQHQKLAQLENIEPSALTKRLKASGLKIYLRTRTQAADLLVKSCTQTKGGSYDF 

SEQ ID NO. 7315 

STRAIN 18RS21 frame: 1 

SAIIDKKVVIFMYLALIGDIINSKQILERETFQQSFQQLMTELSDVYGEELISPFTITAG
DEFQALLKPSKKVFQIIDHIQLALKPVNVRFGLGTGNIITSINSNESIGADGPAYWHARS
AINHIHDKNDYGTVQVAICLDDEDQNLELTLNSLISAGDFIKSKWTTNHFQMLEHLILQD 
NYQEQFQHQKLAQLENIEPSALTKRLKASGLKIYLRTRTQAADLLVKSCTQTKGGSYDF 

SEQ ID NO. 7316 

STRAIN M732 frame: 1 

SAIIDKKVVIFMYLALIGDIINSKQILERETFQQSFQQLMTELSDVYGEELISPFTITAG
DEFQALLKQSKKVFQIIDHIQLALKPVNVRFGLGTGNIITSINSNESIGADGPAYWHARS
AINHIHDKNDYGTVQVAICLDDEDQNLELTLNSLISAGDFIKSKWTTNHFQMLEHLILQD 
NYQEQFQHQKLAQLENIEPSALTKRLKASGLKIYLRTRTQAADLLVKSCTQTKGGSYDF 

SEQ ID NO. 7317 

STRAIN COH1 frame: 1 

SAIIDKKVVIFMYLALIGDIINSKQILERETFQQSFQQLMTELSDVYGEELISPFTITAG
DEFQALLKQSKKVFQIIDHIQLALKPVNVRFGLGTGNIITSINSNESIGADGPAYWHARS
AINHIHDKNDYGTVQVAICLDDEDQNLELTLNSLISAGDFIKSKWTTNHFQMLEHLILQD 
NYQEQFQHQKLAQLENIEPSALTKRLKASGLKIYLRTRTQAADLLVKSCTQTKGGSYDF 

SEQ ID NO. 7318 

STRAIN M781 frame: 1

SAIIDKKVVIFMYLALIGDIINSKQILERETFQQSFQQLMTELSDVYGEELISPFTITAG
DEFQALLKQSKKVFQIIDHIQLALKPVNVRFGLGTGNIITSINSNESIGADGPAYWHARS
AINHIHDKNDYGTVQVAICLDDEDQNLELTLNSLISAGDFIKSKWTTNHFQMLEHLILQD 
NYQEQFQHQKLAQLENIEPSALTKRLKASGLKIYLRTRTQAADLLVKSCTQTKGGSYDF 

SEQ ID NO. 7319 

STRAIN CJB110 frame: 1 

SAIIDKKVVVFMYLALIGDIINSKQILERETFQQSFQQLMTELSDVYGEELISLFTITAG
DEFQALLKPSKKVFQIIDHIQLALKPVNVRFGLGTGNIITSINSNESIGADGPAYWHARS
AINHIHDKNDYGTVQVAICLDDEDQNLELTLNSLISAGDFIKSKWTTNHFQMLEHLILQD 
NYQEQFQHQKLAQLENIEPSALTKRLKASGLKIYLRTRTQAADLLVKSCTQTKGGSYDF 

SEQ ID NO. 7320 

STRAIN JM9130013 frame: 1 

SAIIDKKVVIFMYLALIGDIINSKQILERETFQQSFQQLMTELSDVYGEELISPFTITAG
DEFQALLKPSKKVFQIIDHIQLALKPVNVRFGLGTGNIITSINSNESIGADGPAYWHARS
AINHIHDKNDYGTVQVAICLDDEDQNLELTLNSLISAGDFIKSKWTTNHFQMLEHLILQD 
NYQEQFQHQKLAQLENIEPSALTKRLKASGLKIYLRTRTQAADLLVKSCTQTKGGSYDF 

SEQ ID NO. 7401 
STRAIN 2603 

AT GGAAATGC AAGT T C AAAAAAGT TTT AAAT CAAAT AT AC ATT AC GGAAC ACT CT AT 
C T AGT C C C AACT C C AAT T GGT AAT C TAG AT GAT AT G ACT T TT C GT G CC AT TAG GAT T T T A 
AG AG AAGT T GAT T T T AT T T GT G C AG AGG AT AC AC G AAAT ACGG G AC T T T T AC T C AAG C AC 
TTT GAT AT T AC TACT AAAC AAAT T AGT TTT C AC G AAC AC AAT G CT T AC G AT AAAAT C T CT 
G GG T T AAT T GAT T T GT TAAAAG AAG GG AAAT C T T T AG C C C AAG TAT C T GAT GC AGG AAT G 
CCCTCTATTTCTGACCCAGGACATGACCTTGTCAAGGCTGCTATTGAAGGGGATATCCCA 
GTTGTATCTATACCAGGAGCTAGCGCTGGTATTACTGCTCTCATCGCTTCAGGTTTAGCT 
CCACAACCTCATATTTTTTATGGCTTCTTACCTCGTAAGAAAGGTCAACAAATAACTTTC 
TTT GAAACAAAGC AAG AT T AC C CTGAAACACAAAT CTT T T AT GAGT C ACCGTT T CGAGT C 
T CT GAT AC G CT AAAAC AC AT G AAAG AG AT T T AC G G AGAT C G C C AAGT T GT T TT AGT AC G C 
G AAT T G AC G AAAC T CT AT G AAG AG TAT C AAAG AGG AAC C AT T AGT C AACT T TT AG AG C AT 
AT T G AAAAGGT CC CT C T C AAAG GT G AAT G CT T AAT TAT T G T T GAT G G T AAG AG AG AT AC C 
GAG C GAG T G AAAG AC AG T AGC C AAC AAG AT C C AC T AGT AT T AGT AAAAG AAT AT AT C G CT 
AAT G GT GAT AAAAC T AAT C AAG C GAT AAAAAAAGT AG C AAAAG AAT T T AAT C T C AAT AG A
C AAG AAC T C T AT G CT AGT T T C C AT GAT T T A 

SEQ ID NO. 7402 
STRAIN 090

GAAATGCAAGTTCAAAAAAGTTTTAAATCAAATACACATTACGGGACACT 
CTAT CT AGTCCCAACTCCAATTGGT AAT CT AGATGAT ATGACTTTT CGT G 
C CAT TAG GAT T T T AAGAGAAGT T GAT T T TAT T T GT G C AG AGG AT AC ACG A 
AATACGGGACTTTTACTCAAGCACTTTGATATTACTACTAAACAAATTAG 
TTTTCACGAACACAATGCTTACGATAAAATCTCTGGGTTAATTGATTTGT 
T AAAAGAAGG G AG AT C T T T AGC C C AAGT AT C T GAT G C AG G AAT G C CC T C T 
AT T T C T GAC C C AG g AC AT G AC C T T GT C AAGG C T G CTAT T G AAG GGGG GAT 
CCCGGTCGTATCTATACCAGGAGCTAGCGCTGGTATTACTGCTCTCATCG 
CTTCAGGTTTAGCTCCACAACCTCATATTTTTTATGGCTTCTTACCGCGT 
AAG AAAGGT C AAC AAAT AACT T T T T T T G AAAC AAAGAAAGAT T AC C C T G a 
AACACAAATCTTTTATGAGTCACCGtTTCGAGTCTcTGATACGCTAAAAC 
ACATGAAAGAGATTTACGGAGATCGCCAAGTTGTTTTAGTACGCGAATTG 
AC G AAa C T C T AT G AAG AGT AT C AAAG AGG AAC CAT T AGT C AAC T T T TAG G 
G CAT AT T G AAAAAG T C CCT CT C AAAGG T G AAT G CT T AAT T AT T GT T GAT G 
GTAAGAGAGATACCGAGCGAGTGAAAGACAGTAGCCAACAAGATCCACTA 
GT AT T AGT AA 

SEQ ID NO. 7403 

STRAIN A909 

AGT T C AAAAAAG T TT T AAAT C AAAT AT AC AT T AC G G AAC ACT CTAT C T AG 
TCCCAACTCCAATTGGTAATCTAGATGATATGACTTTTCGTGCCATTAGG 
AT T T T AAG AG AAGT T GAT T T T AT T T GT G C AG AG GAT AC AC G AAAT AC G G G 
AC T T T T AC T C AAGC ACT T T GAT AT TAG T AC T AAAC AAAT T AGT T T T C AC G 
AACACAATGCTTACGATAAAATCTCTGGGTTAATTGATTTGTTAAAAGAA 
GGGAAATCTTTAGCCCAAGTATCTGATGCAGGAATGCCCTCTATTTCTGA 
CCCAGGACATGACCTTGTCAAGGCTGCTATTGAAGGGGATATCCCAGTTG 
TATCTATACCAGGAGCTAGCGCTGGTATTACTGCTCTCATCGCTTCAGGT 
TTAGCTCCACAACCTCATATTTTTTATGGCTTCTTACCACGTAAGAAAGG 
T C AAC AAAT AACT T T C T T T g AAAC AAAG C AAG AT T ACC CT G AAAC AC AAA 
TCTTTTATGAGTCACCGTTTCGAGTCTCtGATACGCTAAAACACATGAAA 
GAG AT T T ACGG AG AT C G C C AAGT T GT T T T AGT ACG C G AAT T G ACG AAAC T 
CTAT GAAG AGT AT C AAAGAG GAAC CAT T AGT C AAC T T T TAG AG CAT AT T G 
AAAAGGTCCCTCTCAAAGGTGAATGCTTAATTATTGTTGATGGTAAGAGA 
GAT AC CG AG C GAG T G AAAG AC AG TAG C C AAC AAG AT CC ACT AGT AT T AGT 
AA 

SEQ ID NO. 7404 

STRAIN H36B

GAAAT GC AAGT T C AAAAAAGTT T T AAAT C AAAT AC AC AT T 
ACGGGACACT CT AT CT AGT CCC AACT CCAATTGGT AAT CT AGATGAT AT G 
ACTTTTCGTGCCATTAGGATTTTAAGAgAAGTTGATTTTATTTGTGCAGA 
GGATACACGAAATACGGGACTTTTACTCAAGCACTTTGATATTACTACTA 
AAC AAAT T AGT T T T C AC GAAC AC AAT G C T T AT GAT AAAAT CTCTGGGTTA 
AT T GAT T T G T T AAAAG AAG G GAG AT C T T TAG C C C AAGT AT C T G AT GC AGG 
AATGCCCTCTATTTCTGACCCAGGACATGACCTTGTCAAGGCTGCTATTG 
AAGGGGATATCCCGGTCGTATCTATACCAGGAGCTAGCGCTGGTATTACT 
GCTCTCATCGCTTCAGGTTTAGCTCCACAACCTCATATTTTTTATGGCTT 
C T T AC C G CG T AAG C AAG GT C AAC AAAT AAC T T T T T T T G AAAC AAAG AAAG 
ATTACCCTGAAACACAAATCTTTTATGAGTCACCGtTTCGAGTCTCTGAT 
ACG CT AAAAC AC AT G AAAG AG AT T T AT G GAG AT C GC CAAGT T G T T TT AG T 
AC G C G AAT T G ACG AAAC T C TAT GAAG AGT AT C AAAG AGG AAC CAT TAG T C 
AACTTTTAGGGCATATTGAAAAGGTCCCTCTCAAAGGTGAATGCTTAATT 
ATTGTTGATGGTAAGAGAGATACTGAGCGAGTGAAAGACAGTAGCCAACA 
AGATCCACTAGTATTAGTAA 

SEQ ID NO. 7405 

STRAIN 18RS21 

GAAATGCAAGT T CAAAAAAGT TT T AAAT C AAAT AT ACAT T 
ACGGAACACTCTATCTAGTCCCAACTCCAATTGGTAATCTAgATGATATG 
AC TTTtCGTGC CAT T AGG AT T T T AAG AGAAGT T GAT T T TAT T T GT G C AGA
Gg AT AC AC G AAAT AC G G G AC TT T T ACT C AAG C AC T T T GAT AT T AC TACT A 
AAC AAAT T AGT T T T C AC GAAC AC AAT G CT T AC G AT AAAAT CTCTGGGTTA 
AT T GAT T T G T T AAAAGAAGGG AAAT C T T T AG C C C AAG TAT CT G AT G C AGG 
AATGCCCTCTATTTCTGACCCAGGACATGACCTTGTCAAGGCTGCTATTG 
AAG GGG AT AT C CC AGT T GT AT C TAT AC C AGG AG C T AG C G C T G GT AT TAG T 
GCTCTCATCGCTTCAGGTTTAGCTCCACAACCTCATATTTTTTATGGCTT 
C T T AC C ACGT AAGAAAGGT C AAC AAAT AAC T T T C t T T G AAAC AAAGC AAG 
AT T AC CC T G AAAC AC AAAT CT T T TAT G AGT C AC CG t T T C G AGT C T C T GAT 
ACG CT AAAAC AC AT G AAAG AG AT T T AC G GAG AT CG C C AAGT T G T T T T AGT 
ACGCGAATTGACGAAACT CTAT GAAGAGTAT CAAAGAGGAAC C ATT AGT C 
AACTTTTAGAGCATATTGAAAAGGTCCCTCTCAAAGGTGAATGCTTAATT 
AT T GT T G AT GGT AAG AG AG AT AC C GAG C GAGTG AAAG AC AGT AGC C AAC A 
AGATCCACTAGTATTAGTAA 

SEQ ID NO. 7406 

STRAIN M732 

G AAAT GC AAGT T C AAAAAAGT T T T AAAT C AAAT 

ATACATTACGGAACACTCTATCTAGTCCCAACTCCAATTGGTAATCTAGA 
T GAT AT G AC TTTTCGTGC C AT T AGGAT T T T AAG AG AAGT T GAT T T T AT T T 
GTGCAGAGGATACACGAAATACGGGACTTTTACTCAAGCACTTTGATATT 
ACTACTAAACAAAT TAGTT T T CACGAACAC AAT GCTTACGAT AAAAT CTC 
TGGGTTAATTGATTTGTT AAAAGAAGGG AAATCTTTAGCCCAAGTATCTG 
ATGCAGGAATGCCCTCTATTTCTGACCCAGGACATGACCTTGTCAAGGCT 
G C TAT T GAAGGGG AT AT C C C AGT T G TAT CT AT AC C AG G AG CT AG C G C T G G 
TATTACTGCTCTCATCGCTTCAGGTTTAGCTCCACAACCTCATATTTTTT 
ATGGCTTCTTACCACGTAAGAAAGGTCAACAAATAACTTTCTTTGAAACA 
AAG C AAGAT T AC C C T GAAAC AC AAAT C T T T TAT G AGT C AC C G t T T C GAG T 
CT C T GAT AC G C T AAAAC AC AT GAAAG AG AT T T AC G GAG AT C G C C AAGT T G 
TTTTAGTACGCGAATTGACGAAACTCTATGAAGAGTATCAAAGAGGAACC 
AT T AGTCAACT T T T AGAGC AT ATT GAAAAGGT C CCT CT C AAAGGTGAATG 
CT T AAT TAT T GT T GAT GGT AAGAGAG AT AC C GAG C GAG T GAAAG AC AGT A 
GCCAACAAGATCCACTAGTATTAGTAA 

SEQ ID NO. 7407 

STRAIN COH1 

GAAAT GCAAGT T C AAAAAAGT T TT a AAT C AAAT AT AC ATT AC 
G GAAC AC T CTAT CT AG T C C C AACT C C AAT T GG T AAT CT AG AT GAT AT G AC 
TT T T CGTGCC ATT AGG AT TTT AAG AGAAGT TGATTTTATTTGTGC AG AGG 
AT AC AC GAAAT AC G G GAc T T T T AC T C AAGC AC T T T GAT AT T AC TACT AAA 
C AAAT T AGT TTT CACGAACAC AAT GCTTACGAT AAAAT CTCTGGGTT AAT 
T GAT T T GT T AAAAGAAG GG AAAT C T T T AGC C C AAGT AT C T GAT G C AG GAA 
TGCCCTCTATTTCTGACCCAGGACATGACCTTGTCAAGGCTGCTATTGAA 
GGGGATATCCCAGTTGTATCTATACCAGGAGCTAGCGCTGGTATTACTGC 
TCTCATCGCTTCAGGTTTAGCTCCACAACCTCATATTTTTTATGGCTTCT 
T AC C AC GT AAGAAAGG T C AAC AAAT AAC T T T C T T T GAAAC AAAG C AAGAT 
TACCCTGAAACACAAATCTTTTATGAGTCACCGtTTCGAGTCTCTGATAC 
G C T AAAAC AC AT GAAAG AG AT T T AC G GAG AT C G C C AAG T T G T TT T AG T AC 
GCGAATT GACGAAACT CTAT GAAGAGTAT CAAAGAGGAACC ATT AGT CAA 
CTTT T AGAGCAT AT T GAAAAGGT C CCT CTCAAAGGTGAATGCTT AAT TAT 
T GT T GAT G G T AAG AG AG AT AC C GAG C G AGT GAAAG AC AGT AG C C AAC AAG 
ATCCACTAGTATTAGTAA 

SEQ ID NO. 7408 

STRAIN M781

AAAT G C AAG T T C AAAAAAG T T T T AAAT C AAAT AT AC AT T AC G GAAC AC T C 
TATCTAGTCCCAACTCCAATTGGTAATCTAGATGATATGACTTTTCGTGC 
CAT TAG GAT T T T AAG AG AAG T T GAT T T TAT T T GT G C AG AGG AT AC AC GAA 
ATACGGgACTTTTACTCAAGCACTTTGATATTACTACTAAACAAATTAGT 
TTTCACGAACACAATGCTTACGATAAAATCTCTGGGTTAATTGATTTGTT 
AAAAGAAG GG AAAT C T T TAG C C C AAGT AT C T GAT G C AG GAAT G C C C T c T A 
TTTCTGACCCAGGACATGACCTTGTCAAGGCTGCTATTGAAGGGGATATC 
CCAGTTGTATCTATACCAGGAGCTAGCGCTGGTATTACTGCTCTCATCGC 
TTCAGGTTTAGCTCCACAACCTCATATTTTTTATGGCTTCTTACCACGTA 
AGAAAGGTCAACAAATAACTTTCTTTGAAACAAAGCAAGATTACCCTGAA
ACACAAATCTTTTATGAGTCACCGTTTCGAGTcTcTGATACGCTAAAACA
CATGAAAGAGATTTACGGAGATCGCCAAGTTGTTTTAGTACGCGAATTGA
CG AAAC T C TAT G AAG AGT AT C AAAGAG GAAC CAT T AGT C AAC T T T T AGAG 
C AT ATT G AAAAGGT C C C T C T C AAAGGT G AAT G CT T AAT TAT T GT T GAT G G 
T AAGAGAGAT AC C GAG C G AGT GAAAG AC AGT AGC C AAC AAGAT C C AC TAG 
TATTAGTAA 
A 

SEQ ID NO. 7409 

STRAIN CJB110 

G AAAT GC AAGT T C AAAAAAGT T T T AAAT C AAAT AC AC AT T AC GGG AC AC 
T CT AT C T AGT C C C AACT C C AAT T G G T AAT C T AGAT GAT AT G ACT T T T C GT 
GC CAT T AGG AT T T T AAG AG AAGT T GAT T T TAT T T GT G C AG AG G AT AC ACG 
AAATACGGGACTTTTACTCAAGCACTTTGATATTACTACTAAACAAATTA
GTTTTCACGAACACAATGCTTACGATAAAATCTCTGGGTTAATTGATTTG
T T AAAAGAAG GG AG AT C T T T AGC CC AAG TAT C T GAT G C AGG AAT GC C C T C 
TAT T T CT G AC C C AG G AC AT GAC C T T GT C AAG G C T GC TAT T G AAGGGG GG A 
T CC CGGT C GT AT C TAT AC C AGG AG C TAG C G CT G GT AT T AC T G C T C T CAT C 
GCTTCAGGTTTAGCTCCACAACCTCATATTTTTTATGGCTTCTTACCGCG 
T AAG AAAG GT C AAC AAAT AACT T T t T T T GAAAC AAAGAAAGAT T AC C CT G 
AAACACAAATCTtTTATGAGTCACCGtTTcGAGTCTCTGATACGCTAAAA 
C AC AT GAAAG AG AT T T ACG GAG ATC GC C AAGT T G T T T T AGT AC G C G AAT T 
GACG AAAC T C TAT G AAGAG T AT C AAAG AG GAAC CAT T AGT C AACT T T T AG 
GGCATATTGAAAAAGTCCCTCTCAAAGGTGAATGCTTAATTATTGTTGAT 
GGT AAGAG AGAT AC C GAG C GAGT GAAAG AC AGT AGC C AAC AAG AT C C AC T 
AGTATTAGTAA 

SEQ ID NO. 7410 

STRAIN 1169NT 

T G C AAGT T C AAAAAAGT T T T AAAT C AAAT AC AC AT TAT G GG AC AC T C T AT 
CTAGTCCCAACTCCAATTGGTAATCTAGATGATATGACTTTTCGTGCCAT 
TAG GAT T TT AAG Ag AAGT T G a T T T T AT T T G T G C AG AGG AT AC AC G AAAT A 
C GG G ACT T T T AC T C AAG C AC T T T GAT a T T AC T AC T AAAC AAAT T AG t T T T 
c ACG AAC AC AAT GC T T AC G AT AAAAT C T C T GGG T T AAT T GAT T t GT T AAA 
AGAAGGGAAAT CTTTAGCC CAAGT ATCT GAT GC AGG AATGCC CT CT AT T T 
C T G ACC C AGG AC AT GAC C T T GT C AAG G C T G C TAT T GAAG GGG AT AT C C C A 
GT T GT AT CT AT AC C AGG AG C T AGCG C T GGT AT T AC T G C T C T CAT CG CT T C 
AGG T T TAG C T C C AC AAC C T CAT AT T T T T T ATG G C T T C T T AC C AC GT AAG A 
AAGGT C AAC AAAT AACT T T T T T T GAAAC AAAG C AAGAT TAT C C T GAAAC A 
CAAATCTTTTATGAGTCACCGtTTCGAGTCTCTGATACGCTAAAACACAT 
GAAAGAGATTTACGGAGATCGCCAAGTTGTTTTAGTACGCGAATTGACgA 
AACTCTATGAAGAGTATCAAAGAGGAACCATTaGTCAACTTTTAGAGCAT
ATTGAAAAGGTCCCTCTCAAAGGTGAATGCTTAATTATTGtTGATGGTAA
GAG AG At a C C GAG C GAG T GAAAG AC AG TAG C C AAC AAG AT C C ACT AG TAT 
TAGTAA 

SEQ ID NO. 7411 

STRAIN JM9130013 

G AAAT G CAAGT T C AAAAAAG T T T T AAAT C AAAT AC AC AT T AC G GG A 

CACTCTATCTAGTCCCAACTCCAATTGGTAATCTAgATGATATGACTTTT 
CGTGCCATTAGGATTTTAAGAGAAGTTGATTTTATTTGTGCAGAGGATAC
ACGAAATACGGGACTTTTACTCAAGCACTTTGATATTACTACTAAACAAA
TTAGTTTTCACGAACACAATGCTTATGATAAAATCTCTGGGTTAATTGAT 
TTGTTAAAAGAAGGGAGATCTTTAGCCCAAGTATCTGATGCAGGAATGCC 
CTCTATTTCTGACCCAGGACATGACCTTGTCAAGGCtGCTATTGAAGGGG 
ATATCCCGGTCGTATCTATACCAGGAGCTAGCGCTGGTATTACTGCTCTC 
ATCGCTTCAGGTTTAGCTCCACAACCTCATATTTTTTATGGCTTCTTACC 
GCGTAAGCAAGGT CAACAAAT AAC t T T TT T TGAAAC AAAGAAAGAT T AC C 
CTGAAACACAAATCTTTTATGAGTCACCGTTTCGAGTCTCTGATACGCTA 
AAAC AC AT GAAAG AG AT T TAT G GAG AT CGC CAAGT T G T T T T AG T AC G C G A 
AT T GAC GAAAC T C TAT GAAG AG TAT C AAa G AGG AAC CAT T AGT C AAC T T T 
T AGGGC AT AT TG a AAAGGT CCCTCTC AAAGGT G AAT GCTT AAT TATTGTT 
GAT G G T AAG AG AG AT AC T GAG C GAG T GAAAG AC AGT AG C C AAC AAG AT C C 
AGTAGTATTAGTAA

SEQ ID NO. 7412 
STRAIN 2603 frame: 1 

MEMQVQKSFKSNIHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHF 
DITTKQISFHEHNAYDKISGLIDLLKEGKSLAQVSDAGMPSISDPGHDLVKAAIEGDIPV
VSIPGASAGITALIASGLAPQPHIFYGFLPRKKGQQITFFETKQDYPETQIFYESPFRVS
DTLKHMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLEHIEKVPLKGECLIIVDGKRDTE
RVKDSSQQDPLVLVKEYIANGDKTNQAIKKVAKEFNLNRQELYASFHDL 

SEQ ID NO. 7413 

STRAIN 090 frame: 1 

EMQVQKSFKSNTHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHFD
ITTKQISFHEHNAYDKISGLIDLLKEGRSLAQVSDAGMPSISDPGHDLVKAAIEGGIPVV
SIPGASAGITALIASGLAPQPHIFYGFLPRKKGQQITFFETKKDYPETQIFYESPFRVSD
TLKHMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLGHIEKVPLKGECLIIVDGKRDTER
VKDSSQQDPLVLV 

SEQ ID NO. 7414 

STRAIN A909 frame: 2 

VQKSFKSNIHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHFDITT 
KQISFHEHNAYDKISGLIDLLKEGKSLAQVSDAGMPSISDPGHDLVKAAIEGDIPVVSIP
GASAGITALIASGLAPQPHIFYGFLPRKKGQQITFFETKQDYPETQIFYESPFRVSDTLK 
HMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLEHIEKVPLKGECLIIVDGKRDTERVKD 
SSQQDPLVLV 

SEQ ID NO. 7415 

STRAIN H36B frame: 1 

EMQVQKSFKSNTHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHFD
ITTKQISFHEHNAYDKISGLIDLLKEGRSLAQVSDAGMPSISDPGHDLVKAAIEGDIPVV
SIPGASAGITALIASGLAPQPHIFYGFLPRKQGQQITFFETKKDYPETQIFYESPFRVSD
TLKHMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLGHIEKVPLKGECLIIVDGKRDTER
VKDSSQQDPLVLV 

SEQ ID NO. 7416 

STRAIN 18RS21 frame: 1 

EMQVQKSFKSNIHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHFD
ITTKQISFHEHNAYDKISGLIDLLKEGKSLAQVSDAGMPSISDPGHDLVKAAIEGDIPVV
SIPGASAGITALIASGLAPQPHIFYGFLPRKKGQQITFFETKQDYPETQIFYESPFRVSD
TLKHMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLEHIEKVPLKGECLIIVDGKRDTER
VKDSSQQDPLVLV 

SEQ ID NO. 7417 

STRAIN M732 frame: 1 

EMQVQKSFKSNIHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHFD
ITTKQISFHEHNAYDKISGLIDLLKEGKSLAQVSDAGMPSISDPGHDLVKAAIEGDIPVV
SIPGASAGITALIASGLAPQPHIFYGFLPRKKGQQITFFETKQDYPETQIFYESPFRVSD
TLKHMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLEHIEKVPLKGECLIIVDGKRDTER 
VKDSSQQDPLVLV 

SEQ ID NO. 7418 

STRAIN COH1 frame: 1 

EMQVQKSFKSNIHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHFD
ITTKQISFHEHNAYDKISGLIDLLKEGKSLAQVSDAGMPSISDPGHDLVKAAIEGDIPVV
SIPGASAGITALIASGLAPQPHIFYGFLPRKKGQQITFFETKQDYPETQIFYESPFRVSD
TLKHMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLEHIEKVPLKGECLIIVDGKRDTER
VKDSSQQDPLVLV 

SEQ ID NO. 7419 

STRAIN M781 frame: 3

MQVQKSFKSNIHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHFDI
TTKQISFHEHNAYDKISGLIDLLKEGKSLAQVSDAGMPSISDPGHDLVKAAIEGDIPVVS
IPGASAGITALIASGLAPQPHIFYGFLPRKKGQQITFFETKQDYPETQIFYESPFRVSDT
LKHMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLEHIEKVPLKGECLIIVDGKRDTERV
KDSSQQDPLVLV

SEQ ID NO. 7420 

STRAIN CJB110 frame: 1 

EMQVQKSFKSNTHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHFD 
ITTKQISFHEHNAYDKISGLIDLLKEGRSLAQVSDAGMPSISDPGHDLVKAAIEGGIPVV 
SIPGASAGITALIASGLAPQPHIFYGFLPRKKGQQITFFETKKDYPETQIFYESPFRVSD 
TLKHMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLGHIEKVPLKGECLIIVDGKRDTER
VKDSSQQDPLVLV 

SEQ ID NO. 7421 

STRAIN 1169NT frame: 3 

QVQKSFKSNTHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHFDIT 
TKQISFHEHNAYDKISGLIDLLKEGKSLAQVSDAGMPSISDPGHDLVKAAIEGDIPVVSI
PGASAGITALIASGLAPQPHIFYGFLPRKKGQQITFFETKQDYPETQIFYESPFRVSDTL 
KHMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLEHIEKVPLKGECLIIVDGKRDTERVK 
DSSQQDPLVLV 

SEQ ID NO. 7422 

STRAIN JM9130013 frame: 1 

EMQVQKSFKSNTHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHFD
ITTKQISFHEHNAYDKISGLIDLLKEGRSLAQVSDAGMPSISDPGHDLVKAAIEGDIPVV
SIPGASAGITALIASGLAPQPHIFYGFLPRKQGQQITFFETKKDYPETQIFYESPFRVSD
TLKHMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLGHIEKVPLKGECLIIVDGKRDTER
VKDSSQQDPVVLV

SEQ ID NO. 7501 
STRAIN 2603 

ATGAGCGTATATGTTAGTGGAATAGGAATTATT 

T C T T C T T T GG G AAAGAAT T AT AG CG AG C AT AAAC AGC AT C T C T T C GAC T T AAAAG AAG G A 
AT T T C T AAAC AT T TAT AT AAAAAT C ACG AC T CT AT T T T AG AAT C T TAT AC AG G AAG CAT A 
ACTAGTGACCCAGAGGTTCCTGAGCAATACAAAGATGAGACACGTAATTTTAAATTTGCT 
TTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTTAATTTAAAAGCTTATCATAAT 
ATTGCTGTGTGTTTAGGGACCTCACTTGGGGGAAAGAGTGCTGGTCAAAATGCCTTGTAT 
CAATTTGAAGAAGGAGAGCGTCAAGTAGATGCTAGTTTATTAGAAAAAGCATCTGTTTAC 
CATATTGCTGATGAATTGATGGCTTATCATGATATTGTGGGAGCTTCGTATGTTATTTCA 
AC CGCCTGTTCTG C AAG T AAT AAT G CCGT AAT AT T AGG AAC AC AAT TACT T C AAG AT G G C 
GAT T GT G AT T T AGCT AT TTGTGGTGGCTGT GAT G AGT T AAGT GAT AT T T C T T TAG C AG GC 
T T C AC AT C AC T AGG AGCT AT T AAT AC AGAAAT GG C AT GT C AG C C C T AT T C T T C T GG AAAA 
GGAATCAATTTGGGTGAGGGCGCTGGTTTTGTTGTTCTTGTCAAAGATCAGTCCTTAGCT 
AAAT AT G G AAAAAT TAT CGGTGGTCT TAT T AC T T C AGAT G GT T AT CAT AT AAC AG C AC C T 
AAG C C AAC AG G T G AAGGG G CG GC AC AG AT T GC AAAG C AG C T AGT GAC T C AAG C AGGT AT T 
GACTACAGTGAGATTGACTATATTAACGGTCACGGTACAGGTACTCAAGCTAATGATAAA 
AT GG AAAAAAAT AT GT AT GG T AAGT TT T T C C C GAC AACG AC AT T GAT C AG C AGT AC C AAG 
G GG C AAAC G GGT CAT AC T C T AGG GG C T G C AG G TAT TAT C GAAT T GAT T AAT T GT T T AGC G 
GCAATAGAGGAACAGACTGTACCAGCAACTAAAAATGAGATTGGGATAGAAGGTTTTCCA 
G AAAAT T T T GT C T AT CAT C AAAAG AGAGAATACC C AAT AAG AAAT G C T T T AAAT TT T T CG 
TTTGCTTTTGGTGGAAATAATAGTGGTGTCTTATTGTCATCTTTAGATTCACCTCTAGAA 
ACATTACCTGCTAGAGAAAATCTTAAAATGGCTATCTTATCATCTGTTGCTTCCATTTCT 
AAG AAT GAAT C ACTT T CT AT AAC C T AT G AAAAAG T T G C T AGT AAT T T C AAC G AC T T T G AA 
G C AT T ACG C T T T AAAG G G G CT AGAC C AC CC AAAAC T G T C AAC C C AG C AC AAT T T AGG AAA 
ATGGATGATTTTTCCAAAATGGTTGCCGTAACAACAGCTCAAGCACTAATAGAAAGCAAT 
AT T AAT C T AAAAAAAC AAG AT AC T T C AAAAGT AGG AAT T GT AT T T AC AAC AC T T T C T GG A 
C C AGT T G AGG T T G T T G AAGG T AT T G AAAAG C AAAT C AC AAC AG AAGG AT AT G C AC AT GT T 
TCTGCTTCACGATTCCCGTTTACAGTAATGAATGCAGCAGCTGGTATGCTTTCTATCATT 
TTTAAAATAACAGGTCCTTTATCTGTCATTTCGACAAATAGTGGAGCGCTTGATGGTATA 
C AAT AT GC C AAGG AAAT GAT G CGT AAC GAT AAT C TAG AC TAT G T GAT TCTTGTTTCTGCT 
AATCAGTGGACAGACATGAGTTTTATGTGGTGGCAACAATTAAACTATGATAGTCAAATG 
TTTGTCGGTTCTGATTATTGTTCAGCACAAGTCCTCTCTCGTCAAGCATTGGATAATTCT 
CCTATAATATTAGGTAGTAAACAATTAAAATATAGCCATAAAACATTCACAGATGTGATG 
ACT AT T T T T GAT GCTGCGCTT C AAAAT T TAT TAT C AGAC T TAG G AC T AAC CAT AAAAG AT 
AT C AAAG GT T T CGT T T GG AAT G AG CG G AAG AAGG C AGT T AGT T C AG AT TAT GAT T T CT T A 
GCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCTTCTGGTCAGTTTGGATTTTCA 
T C T AAT G GT G C T GGT G AAG AAC T G GAC TAT AC T GT T AAT G AAAGT AT AG AAAAGG GC T AT 
TATTTAGTCCTATCTTATTCGATCTTCGGTGGTATCTCTTTTGCTATTATTGAAAAAAGG

SEQ ID NO. 7502 

STRAIN 090 

AT G T T AGT G GAAT AGG AAT TAT TTCTTCTTT G GG AAAG a AT T AT 

AG C GAG C AT AAAC AG CAT C T C T T CGACT T AAAAG AAG G AAT T T C T AAAC A 

TTTATATAAAAATCACGACTCTATTTTAGAATCTTATACAGGAAGCATAA 

CT AGT GAC C C AG AG G T T C CT GAG C AAT AC AAAG AT G AG AC ACGT AAT T T T 

AAATTTGCTTTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTTAA 

TTTAAAAGCTTATCATAATATTGCTGTGTGTTTAGGGACCTCACTTGGGG 

GAAAGAGTGCTGGTCAAAATGCCTTGTATCAATTTGAAGAAGGAGAGCGT 

C AAG TAG AT G C T AGT T TAT T AG AAAAAG CAT CT GT T T AC CAT AT T G CT G A 

TGAATTGATGGCTTATCATGATATTGTGGGAGCTTCGTATGTTATTTCAA 

CCGCCTGTTC T GC AAGT AAT AAT GC CGT AAT AT T AG G AAC AC AAT T AC T T 

CAAGATGGCGATTGTGATTTAGCTATTTGTGGTGGCTGTGATGAGTTAAG 

T GAT AT T T CT T TAG C AGG C T T C AC AT C ACT AG GAG C T AT T AAT AC AG AAA 

TGGCATGTCAGCCCTATTCTTCTGGAAAAGGAATCAATTTGGGTGAGGGC 

GCTGGTTTTGTTGTTCTTGTCAAAGATCAGTCCTTAGCTAAATATGGAAA 

AATTATCGGTGGTCTTATTACTTCAGATGGTTATCATATAACAGCACCTA 

AG C C AAC AG G T GAAG G GG CGG C AC AG AT T G C AAAG C AGCT AG T G ACT C AA 

G C AG GT AT T G AC T AC AGT GAG AT T G ACT AT AT T AAC GGT C AC GGT AC AGG 

TACT C AAG C T AAT G AT AAAAT G GAAAAAAAT AT GT AT G G T AAGT T T T T C C 

C GAC AAC GAC AT T GAT C AG C AGT AC C AAGG G G C AAAC GG G T CAT ACT C T A 

GGGGCTGCAGGTATTATCGAATTGATTAATTGTTTAGCGGCAATAGAGGA 

AC AG ACT GT AC C AG C AAC T AAAAAT GAG AT T GG G AT AG AAGGTT T T C C AG 

AAAATTTTGTCTATCATCAAAAGAGAGAATACCCAATAAGAAATGCTTTA 

AATTTTTCGTTTGCTTTTGGTGGAAATAATAGTGGTATCTTATTGTCATC 

T T TAG AT T C AC C T C TAG AAA CAT T AC C T G C TAG AG AAAAT C T T AAAAT G G 

CTATCTTATCATCTGTTGCTTCCATTTCTAAGAATGAATCACTTTCTATA 

AC C TAT GAAAAAGT T G CT AGT AAT T T C AAC GAC T T T GAAG C ATT ACG C T T 

T AAAGG G G C T AGAC C AC C C AAAAC T GT C AAC C C AG C AC AATT T AGG AAAA 

TGGATGATTTTTCCAAAATGGTTGCCGTAACAACAGCTCAAGCACTAATA 

G AAAGC AAT AT T AAT CT AAAAAAAC AAG AT ACT T CAAAAGTAGGAATTGT 

AT T T AC AAC AC T T T C T GG AC C AGT T G AGGT T G T T G AAGG TAT T G AAAAGC 

AAAT C AC AAC AG AAG GAT AT GC AC AT GTTTCTGCTT C AC GAT TCCCGTTT 

ACAGTAATGAATGCAGC AGCT GGT AT GCTTTCT AT CATTTTT AAAAT AAC 

AGGTCCTTTATCTGTCATTTCGACAAATAGTGGAGCGCTTGATGGTATAC 

AAT AT G C C AAGG AAAT GAT G CGT AAC GAT AAT C TAG AC TAT GTG AT T CT T 

GT T T C T G CT AAT C AGT GG AC AG AC AT G AGT T T T AT GT G GT G G C AAC AAT T 

AAACTATGATAGTCAAATGTTTGTCGGTTCTGATTATTGTTCAGCACAAG 

TCCTCTCTCGTCAAGCATTGGATAATTCTCCTATAATATTAGGTAGTAAA 

C AAT T AAAAT AT AG C CAT AAAAC AT T C AC AG AT GT GAT GAC TAT T T T T G A 

TGCTGCGCT T C AAAAT T TAT TAT C AGACT T AG G AC T AAC CAT AAAAG AT A 

TCAAAGGTTTCGTTTGGAATGAGCGGAAGAAGGCAGTTAGTTCAGATTAT 

GATTTCTTAGCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCTTC 

TGGTCAGTTTGGATTTTCATCTAATGGTGCTGGTGAAGAACTGGACTATA 

CTGTTAATGAAAGTATAGAAAAGGGCTATTATTTAGTCCTATCTTATTCG 

ATCTTTGGTGGTATCTCTTTTGCTATTATTGAAAAAAGG 

SEQ ID NO. 7503 

STRAIN A909 

ATGT T AGTGG AAT AGGAATTATTTCTTCTTTGGG AAAG AATT 

AT AG CGAG CAT AAAC AG CAT C T CT T C G AC T T AAAAG AAG GAAT T T C T AAA 

C AT TT AT AT AAAAAT CACGACT CT AT TTT AGAAT CTT AT AC AGGAAG CAT 

AAC TAG T GAC C C AG AG GT T C C T GAG C AAT AC AAAG AT GAG AC AC GT AAT T 

TTAAATTTGCTTTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTT 

AATTTAAAAGCTTATCATAATATTGCTGTGTGTTTAGGGACCTCACTTGG 

GGG AAAGAGT G CT GGT C AAAAT G C CT T GT AT C AAT T T GAAG AAGG AG AG C 

GT C AAG TAG AT G C T AG T T TAT TAG AAAAAG CAT CT GT T T AC CAT AT T G CT 

GATGAATTGATGGCTTATCATGATATTGTGGGAGCTTCGTATGTTATTTC 

AACCGCCTGTTCTGCAAGTAATAATGCCGTAATATTAGGAACACAATTAC 

TTCAAGATGGCGATTGTGATTTAGCTATTTGTGGTGGCTGTGATGAGTTA 

AGT GAT AT T T C T T T AGC AG G CT T C AC AT C AC TAG GAG C TAT T AAT AC AG A 

AATGGCATGTCAGCCCTATTCTTCTGGAAAAGGAATCAATTTGGGTGAGG 
GCGCTGGTTTTGTTGTTCTTGTCAAAGATCAGTCCTTAGCTAAATATGGA
AAAAT T AT CGGTGGTC T T AT T AC T T C AG AT G G T TAT C AT AT AAC AG C AC C 
T AAG C CAAC AG GT GAAG GGG CG G C AC AG ATT G C AAAG C AG C T AGT GAC T C 
AAG C AGGT AT T G AC T AC AGT GAG AT T G AC TAT AT T AACGGT C AC GGT AC A 
GGT ACT C AAG C T AAT G AT AAAAT GG AAAAAAAT AT GT AT GG T AAG T T T T T 
C C C GAC AACG AC AT T GAT C AG C AG T AC C AAG G GG C AAACG G GT CAT AC T C 
TAGGGGCTGCAGGTATTATCGAATTGATTAATTGTTTAGCGGCAATAGAG 
GAAC AG AC T GT AC C AG C AACT AAAAAT GAG AT T G GG AT AG AAGGT T T T C C 
AGAAAAT T T T GT CT AT CAT C AAAAG AG AG AAT AC C C AAT AAG AAAT G C T T 
TAAATTTTTCGTTTGCTTTTGGTGGAAATAATAGTGGTGTCTTATTGTCA 
T CTT T AGAT T CACCT CT AGAAAC AT T AC CTGCT AGAGAAAAT CTT AAAAT 
GGCTATCTTATCATCTGTTGCTTCCATTTCTAAGAATGAATCACTTTCTA 
T AAC CT AT G AAAAAG T T GC T AGT AAT T T C AACG AC TT T GAAG CAT T ACGC 
T T T AAAGGGG CT AG AC C AC C C AAAACT G T CAAC C C AG C AC AAT T T AG G AA 
AATGGATGATTTTTCCAAAATGGTTGCCGTAACAACAGCTCAAGCACTAA 
TAGAAAGCAATATTAATCTAAAAAAACAAGATACTTCAAAAGTAGGAATT 
GT AT T T AC AAC ACT T T C T GG AC C AG T T GAG G T T G T T G AAGGT AT T G AAAA 
G C AAAT C AC AAC AG AAGG AT AT G C AC AT GTTTCTGCTT C AC GAT T CC C GT 
TTACAGTAATGAATGCAGCAGCTGGTATGCTTTCTATCATTTTTAAAATA 
AC AG GT C CT T T AT CT GT CAT T T CGAC AAAT AGT GG AG C GC T T GAT G G T AT 
AC AAT AT G C C AAG G AAAT GAT G CGT AAC GAT AAT C TAG AC TAT G T GAT T C 
TTGTTTCT G CT AAT C AG T GGAC AGAC AT GAG T T T TAT GT G GT G G C AAC AA 
TTAAACTATGATAGTCAAATGTTTGTCGGTTCTGATTATTGTTCAGCACA 
AGTCCTCTCTCGTCAAGCATTGGATAATTCTCCTATAATATTAGGTAGTA 
AAC AAT T AAAAT AT AG C C AT AAAAC AT T C AC AG AT GT G AT GAC TAT T TT T 
GAT GCTGCGCTT C AAAAT T TAT TAT C AG AC T T AGG AC T AAC C AT AAAAG A 
TATCAAAGGTTTCGTTTGGAATGAGCGGAAGAAGGCAGTTAGTTCAGATT 
ATGATTTCTTAGCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCT 
TCTGGTCAGTTTGGATTTTCATCTAATGGTGCTGGTGAAGAACTGGACTA 
T ACT GT TAAT GAAAGT AT AGAAAAGGGCTAT TATT TAGT CCT AT CT T AT T 
CGATCTTCGGTGGTATCTCTTTTGCTATTATTGAAAAAAGG 

SEQ ID NO. 7504 

STRAIN H36B

ATGTTAGTGGAATAGGAATTATTTCTTCTTTGGGAAAGAATTATAGCGA 
GCAT AAACAGC AT CT CTT CGACT T AAAAG AAGGAATT T CT AAACAT TT AT 
AT AAAAAT C ACGAC T C TAT T T TAG AAT C T TAT AC AGG AAGC AT AAC TAGT 
GAC C C AGAGGT T C CT GAG C AAT AC AAAG AT GAG AC AC GT AAT T T T AAAT T 
TGCTTTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTTAATTTAA 
AAGCTTATCATAATATTGCTGTGTGTTTAGGGACCTCACTTGGGGGAAAG 
AGT G C T G G T C AAAAT G C C T T GT AT C AAT T T GAAG AAGG AG AG CGT C AAGT 
AGAT G C T AG T T TAT T AG AAAAAGC AT CT G T T T AC C AT AT TG CT G AT G AAT 
TGATGGCTTATCATGATATTGTGGGAGCTTCGTATGTTATTTCAACCGCC 
T GT T C T G C AAGT AAT AAT G C C GT AAT AT TAG GAAC AC AAT T AC T T C AAG A 
TGGCGATTGTGATTTAGCTATTTGTGGTGGCTGTGATGAGTTAAGTGATA 
TTTCTTTAGCAGGCTTCACATCACTAGGAGCTATTAATACAGAAATGGCA 
TGTCAGCCCTATTCTTCTGGAAAAGGAATCAATTTGGGTGAGGGCGCTGG 
TTTTGTTGTTCTTGTCAAAGATCAGTCCTTAGCTAAATATGGAAAAATTA 
TCGGTGGTCT TAT T AC T T C AG ATG G T TAT CAT AT AAC AG C AC C T AAG C C A 
AC AG GT G AAGG G G C GG C AC AG ATT G C AAAG C AG C TAGT G ACT C AAG C AG G 
TAT T GAC T AC AGT GAG AT T GAC TAT AT T AAC GGT C AC GGT AC AG GT AC T C 
AAGCTAATGATAAAATGGAAAAAAATATGTATGGTAAGTTTTTCCCGACA 
AC GAC AT T GAT C AG C AGT AC C AAG G GG C AAAC G GGT CAT AC T C TAG GG G C 
TGCAGGTATTATCGAATTGATTAATTGTTTAGCGGCAATAGAGGAACAGA 
C T G T AC C AG CAAC T AAAAAT GAG AT T G G GAT AG AAG G T T T T C C AG AAAAT 
T T T G T C T AT CAT C AAAAG AG AG AAT AC C C AAT AAG AAAT G C T T T AAAT T T 
TTCGTTTGCTTTTGGTGGAAATAATAGTGGTGTCTTATTGTCATCTTTAG 
AT T C AC C T C TAG AAAC AT T AC C T G C TAG AG AAAAT C T T AAAAT G G C TAT C 
TTATCATCTGTTGCTTCCATTTCTAAGAATGAATCACTTTCTATAACCTA 
TGAAAAAGTTGCTAGTAATTTCAACGACTTTGAAGCATTACGCTTTAAAG 
G GG C T AG AC C AC C C AAAAC T G T CAAC C C AG C AC AAT T T AGG AAAAT GG AT 
GAT T T T T C C AAAAT G GT T GC C GT AAC AAC AG CT C AAG C AC TAAT AG AAAG 
C AAT AT TAAT C T AAAAAAAC AAG AT AC T T C AAAAGT AG G AAT T G T AT T T A 
CAACACTTTCTGGACCAGTTGAGGTTGTTGAAGGTATTGAAAAGCAAATC
AC AAC AG AAG GAT AT G C AC AT GT T T C T G C T T C AC GAT T C C C GT T T AC AGT
AAT G AAT G C AG C AGC T GGT AT GC T T T C T AT CAT T T T T AAAAT AAC AG GT C 
C T T TAT C T GT CAT T T CG AC AAAT AGT G G AGC G CT T GAT G GT AT AC AAT AT 
G C C AAG GAAAT GAT G C G T AAC GAT AAT CT AGAC T AT GT G AT TCTTGTTTC 
TGCTAATCAGTGGACAGACATGAGTTTTATGTGGTGGCAACAATTAAACT 
AT GAT AGT C AAAT GT T TGTCGGTTCT GAT TAT T GT T C AGC AC AAGT C C T C 
TCTCGTCAAGCATTGGATAATTCTCCTATAATATTAGGTAGTAAACAATT 
AAAAT AT AG C C AT AAAAC AT T C AC AG AT GT GAT GACT AT T T T T GAT G C T G 
C GC T T C AAAAT T TAT TAT C AG AC T T AGG AC T AAC C AT AAAAG AT AT C AAA 
GGTTTCGTTTG GAAT GAG CG G AAGAAGG C AGT T AGT T C AGAT TAT GAT T T 
CTTAGCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCTTCTGGTC 
AGTTTGGATTTTCATCTAATGGTGCTGGTGAAGAACTGGACTATACTGTT 
AAT G AAAG T AT AG AAAAGGG C TAT TAT T T AG T C CT AT C T T AT T CG AT C T T 
CGGTGGTATCTCTTTTGCTATTATTGAAAAAAGG 

SEQ ID NO. 7505 

STRAIN 18RS21 

ATGTTAGTGGAATAGGAATTATTTCTTCTTTGGGAAAGAATTATAGC
GAGC AT AAAC AG CAT C T CT T CG AC T T AAAAGAAG GAAT T T C T AAAC AT T T 
AT AT AAAAAT C AC GACT CT ATT T TAG AAT CT TAT AC AGG AAG C AT AACT A 
GTGACCCAGAGGT T CCTGAGCAAT ACAAAGAT GAGAC ACGT AAT TTTAAA 
TTTGCTTTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTTAATTT 
AAAAG CT TAT CAT AAT ATT GCTGTGTGTTTAGGG AC CTC ACT TGGGGGAA 
AG AGT G CT G GT C AAAAT G C C T T GT AT C AAT T T G AAG AAGG AGAG C GT C AA 
GTAGATGCTAGTTTATTAGAAAAAGCATCTGTTTACCATATTGCTGATGA 
ATTGATGGCTTATCATGATATTGTGGGAGCTTCGTATGTTATTTCAACCG 
CCTGTTCT GC AAG T AAT AAT GC C GT AAT AT T AGG AAC AC AAT T AC T T C AA 
GATGGCGATTGTGATTTAGCTATTTGTGGTGGCTGTGATGAGTTAAGTGA 
TAT T T CT TT AG C AGGCT T C AC AT C AC T AGG AG C T AT T AAT AC AGAAAT GG 
CATGTCAGCCCTATTCTTCTGGAAAAGGAATCAATTTGGGTGAGGGCGCT 
GGTTTTGTTGTTCTTGTCAAAGATCAGTCCTTAGCTAAATATGGAAAAAT 
TAT CGGTGGTCT TAT TACT T C AG AT GGT TAT CAT AT AAC AG C AC C T AAG C 
C AAC AGGTG AAG G G G C G GC AC AGAT T G C AAAG C AG C T AGT G AC T C AAGC A 
GGTATTGACTACAGTGAGATTGACTATATTAACGGTCACGGTACAGGTAC 
TCAAGCTAATGATAAAATGGAAAAAAATATGTATGGTAAGTTTTTCCCGA 
C AAC GAC AT T GAT C AG C AGT AC C AAGGGG C AAACG G GT CAT AC T CT AGG G 
GCTGCAGGTATTATCGAATTGATTAATTGTTTAGCGGCAATAGAGGAACA 
GAC T G T AC C AG C AAC T AAAAAT GAG AT T G GG AT AG AAG GTT T T C C AGAAA 
AT T T T GT CT AT CAT C AAAAGAG AG AAT AC C C AAT AAGAAAT G C T T T AAAT 
TTTTCGTTTGCTTTTGGTGGAAATAATAGTGGTGTCTTATTGTCATCTTT 
AGATTCACCTCTAGAAACATTACCTGCTAGAGAAAATCTTAAAATGGCTA 
TCTTATCATCTGTTGCTTCCATTTCTAAGAATGAATCACTTTCTATAACC 
TATGAAAAAGTTGCTAGTAATTTCAACGACTTTGAAGCATTACGCTTTAA 
AGGG G C T AGAC C AC C C AAAAC T GT C AAC C C AG C AC AAT T T AGG AAAAT GG 
AT GAT T T T T C C AAAAT GGT TGC CGT AAC AAC AG CT C AAG C AC T AAT AG AA 
AGC AAT AT T AAT CT AAAAAAACAAGAT ACTT C AAAAGTAGGAAT TGT AT T 
T AC AAC ACT T T CT GG AC C AGT T G AG GT T GT T G AAG G T AT T G AAAAG C AAA 
T C AC AAC AG AAGGAT AT G C AC AT GTTTCTGCTT C AC GAT T C C C G T T T AC A 
GTAATGAATGCAGCAGCTGGTATGCTTTCTATCATTTTTAAAATAACAGG 
T C C T T TAT C T G T CAT T T C GAC AAAT AGT G GAG C G C T T GAT GGT AT AC AAT 
AT G C C AAG GAAAT GAT G C GT AAC GAT AAT C TAG AC TAT GT GAT T CT T GT T 
T C T G C T AAT C AGT GG AC AG AC AT G AGT T T TAT GT GGT G G C AAC AAT T AAA 
CTATGATAGTCAAATGTTTGTCGGTTCTGATTATTGTTCAGCACAAGTCC 
TCTCTCGT C AAG CAT T GG AT AAT T C T C C TAT AAT AT T AGGT AGT AAAC AA 
T T AAAAT AT AG C CAT AAAAC AT T C AC AG AT G T GAT GAC TAT T T T T GAT G C 
T G C G C T T C AAAAT T TAT TAT C AG AC T T AGG AC T AAC CAT AAAAG AT AT C A 
AAG GT T T C GT T T G GAAT G AGC G G AAGAAG G C AGT T AGT T C AG AT TAT GAT 
TTCTTAGCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCTTCTGG 
T C AGT T T G GAT T T T CAT C T AAT GGTGCTGGT G AAG AAC T GG ACT AT AC T G 
T T AAT G AAAGT AT AG AAAAGG G CT AT TAT T T AG T C C T AT CT T AT T C G AT C 
TTCGGTGGTATCTCTTTTGCTATTATTGAAAAAAGG 

SEQ ID NO. 7506 

STRAIN M732 
ATGTTAGTGGAATAGGAATTATTTCTTCTTTGGGAAAGAATTATAG
C GAG C AT AAAC AG CAT CT CT T C G ACT T AAAAG AAGG AAT T T C T AAAC AT T 
T AT AT AAAAAT C ACGACTCT ATTTT AGAAT CTT AT AC AGGAAGCAT AACT 
AGT G AC C C AGAGGT T C CT GAG C AAT AC AAAGAT GAG AC AC GT AAT T T T AA 
ATTTGCTTTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTTAATT 
TAAAAGCTTATCATAATATTGCTGTGTGTTTAGGGACCTCACTTGGGGGA 
AAG AGT G CT G GT C AAAAT G C CT T GT AT C AAT T T G AAG AAG GAG AG CGT C A 
AG TAG AT G CT AGT T T AT T AGAAAAAG CAT C T GT T T AC CAT AT T GC T GAT G 
AATTGATGGCTTATCATGATATTGTGGGAGCTTCGTATGTTATTTCAACC 
GCCTGTTCTGCAAGTAATAATGCCGTAATATTAGGAACACAATTACTTCA 
AGATGGCGATTGTGATTTAGCTATTTGTGGTGGCTGTGATGAGTTAAGTG 
AT AT T T CT T T AGC AG G C T T C AC AT C AC TAG GAG CT AT T AAT AC AG AAAT G 
GCATGTCAGCCCTATTCTTCTGGAAAAGGAATCAATTTGGGTGAGGGCGC 
TGGTTTTGTTGTTCTTGTCAAAGATCAGTCCTTAGCTAAATATGGAAAAA 
TTATCGGTGGTCTTATTACTTCAGATGGTTATCATATAACAGCACCTAAG 
C C AAC AGGT G AAGG G G C G GC AC AG AT T G C AAAG C AG C T AGT GAC T C AAG C 
AGGT AT T G ACT AC AGT GAG AT T G AC TAT AT T AAC GGT C ACG GT AC AGGT A 
CTCAAGCT AAT GAT AAAAT GGAAAAAAAT AT GTATGGTAAGTTTTTCCCG 
AC AACGAC AT T GAT C AGC AGT AC C AAG G G G C AAAC GGGT CAT AC T CT AG G 
GG CT G C AG GT AT TAT C GAAT T GAT T AAT T GT T TAG CGG C AAT AG AGG AAC 
AGACT GT AC C AGC AACT AAAAAT GAG AT T GG GAT AG AAG GT T TT C CAGAA 
AAT T T T GT C TAT CAT C AAAAGAG AG AAT AC C C AAT AAG AAAT GC T T T AAA 
TTTTTCGTTTGCTTTTGGTGGAAATAATAGTGGTGTCTTATTGTCATCTT 
TAGATTCACCTCTAGAAACATTACCTGCTAGAGAAAATCTTAAAATGGCT 
ATCTTATCATCTGTTGCTTCCATTTCTAAGAATGAATCACTTTCTATAAC 
CTATGAAAAAGTTGCTAGTAATTTCAACGACTTTGAAGCATTACGCTTTA 
AAGG GG C TAG AC C AC C C AAAAC T GT C AAC C C AG C AC AAT T T AGG AAAAT G 
GAT GAT T T T T C C AAAAT G GT T G CC G T AAC AAC AGC T C AAGC AC T AAT AG A 
AAGC AAT AT T AAT C T AAAAAAAC AAG AT AC T T C AAAAGT AGG AAT T G T AT 
TTACAACACTTTCTGGACCAGTTGAGGTTGTTGAAGGTATTGAAAAGCAA 
AT C AC AAC AG AAGG AT AT GC AC AT GTTTCTGCTT C ACG ATT C CC GT T T AC 
AGT AAT GAAT G C AG C AG CT GGT AT G C T T T C TAT CAT T T T T AAAAT AAC AG 
GT C CT T T AT CT GT CAT T T CG AC AAAT AGT GG AG C G C T T GAT GGT AT AC AA 
TAT G C C AAGG AAAT GAT G C G T AAC GAT AAT CT AG AC TAT GT GAT T CT T GT 
T T CT G C T AAT C AGT G GAC AG AC AT G AGT T T T AT GT G GT G G C AAC AAT T AA 
ACT AT GAT AGT C AAAT GTTTGTCGGTTCT GAT TAT T GT T C AG C AC AAGT C 
CTCTCTCGT C AAG CAT T G GAT AAT T C T C C TAT AAT AT TAG GT AGT AAAC A 
AT T AAAAT AT AGC CAT AAAAC AT T C AC AG AT GT GAT G AC T AT TT T T GAT G 
CTGCGCTT C AAAAT T TAT TAT C AG AC T T AGG ACT AAC CAT AAAAG AT AT C 
AAAG GT T T C G T T T G GAAT G AGC G G AAG AAGG C AGT T AGT T C AGAT TAT G A 
TTTCTTAGCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCTTCTG 
GTCAGTTTGGATTTTCATCTAATGGTGCTGGTGAAGAACTGGACTAtaCT 
GT T AAT G AAAGT AT AG AAAAGG G C T AT TAT T T AGT C CT AT C T TAT T C GAT 
CTTCGGTGGTATCTCTTTTGCTATTATTGAAAAAAGG 

SEQ ID NO. 7507 

STRAIN COH1 

AT GT T AGT G GAAT AG GAAT TAT TTCTTCTT T GGG AAAG AAT TAT AGC 

GAGCAT AAACAGC AT CT CT T CGACTT AAAAGAAGGAATT T CTAAAC AT TT 

ATATAAAAATCACGACTCTATTTTAGAATCTTATACAGGAAGCATAACTA 

G T GAC C C AG AGGT T C CT GAG C AAT AC AAAG AT GAG AC AC GT AAT T T T AAA 

TTTGCTTTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTTAATTT 

AAAAGCTTATCATAATATTGCTGTGTGTTTAGGGACCTCACTTGGGGGAA 

AGAGT G C T GGT C AAAATG C C T T GT AT C AAT T T G AAG AAG GAG AG C GT C AA 

GTAGATGCTAGTTTATTAGAAAAAGCATCTGTTTACCATATTGCTGATGA 

ATTGATGGCTTATCATGATATTGTGGGAGCTTCGTATGTTATTTCAACCG 

CCTGTTCTGCAAGTAATAATGCCGTAATATTAGGAACACAATTACTTCAA 

GATGGCGATTGTGATTTAGCTATTTGTGGTGGCTGTGATGAGTTAAGTGA 

TATTTCTTTAGCAGGCTTCACATCACTAGGAGCTATTAATACAGAAATGG 

CATGTCAGCCCTATTCTTCTGGAAAAGGAATCAATTTGGGTGAGGGCGCT 

GGTTTTGTTGTTCTTGTCAAAGATCAGTCCTTAGCTAAATATGGAAAAAT 

TATCGGTGGTCTTATTACTTCAGATGGTTATCATATAACAGCACCTAAGC 

C AAC AG GT G AAG G GG C G G C AC AG AT T G C AAAG C AG CT AGT G ACT C AAG C A 

GGTATTGACTACAGTGAGATTGACTATATTAACGGTCACGGTACAGGTAC 
T C AAG CT AAT GAT AAAAT G G AAAAAAAT AT G T AT G GT AAGT T T T T C C C GA
C AACG AC AT T GAT C AG C AGT AC C AAGGG G C AAAC GGGT CAT AC T CT AGG G 
G C T G C AG GT AT TAT C GAAT T GAT T AAT T GT T TAG C G G C AAT AGAGG AAC A 
GACTGTACCAGCAACTAAAAATGAGATTGGGATAGAAGGTTTTCCAGAAA 
ATT T T GT CT AT CAT CAAAAG AGAGAAT AC C C AAT AAG AAAT GC T T T AAAT 
TTTTCGTTTGCTTTTGGTGGAAATAATAGTGGTGTCTTATTGTCATCTTT 
AG AT T C ACCT CT AG AAAC AT T AC CT GC TAGAG AAAAT CT T AAAAT G G CT A 
TCTTATCATCTGTTGCTTCCATTTCTAAGAATGAATCACTTTCTATAACC 
TAT GAAAAAGT T GCT AGT AAT T T C AAC GACT T T GAAG C AT T AC G C T T T AA 
AG G G G CT AGACC ACC C AAAAC T GT C AAC C C AG C AC AAT T TAG G AAAAT GG 
AT GAT T T T T C C AAAAT G GT T G C CGT AAC AAC AGC T C AAG C ACT AAT AG AA 
AG C AAT AT T AAT CT AAAAAAAC AAGAT AC T T C AAAAGT AG GAAT T GT AT T 
T AC AAC AC T T T CT GG AC C AGT T GAG GT T GT T G AAGGT AT T G AAAAG C AAA 
T C AC AAC AGAAGGAT AT G C AC AT GT T T C T G C T T C AC GAT T C C C GT T T AC A 
G T AAT GAAT G C AG C AG C T GGT AT GCT T T CT AT CAT T T T T AAAAT AAC AGG 
TCCTTTATCTGTCATTTCGACAAATAGTGGAGCGCTTGATGGTATACAAT 
ATGCCAAGGAAATGATGCGTAACGATAATCTAGACTATGTGATTCTTGTT 
T C T G C T AAT C AGT GG AC AGAC AT G AGT T T TAT GT GGT G G C AAC AAT T AAA 
CTATGATAGTCAAATGTTTGTCGGTTCTGATTATTGTTCAGCACAAGTCC 
TCTCTCGTCAAGCATTGGATAATTCTCCTATAATATTAGGTAGTAAACAA 
T T AAAAT AT AG C CAT AAAAC AT T C AC AGAT GT GAT G AC TAT T T T T GAT GC 
T G CG CT T C AAAAT T TAT TAT C AGACT T AGG AC T AAC CAT AAAAG AT AT C A 
AAGGT TTCGTTTG GAAT GAG C G GAAG AAG G C AG T T AGT T C AG AT TAT GAT 
TTCTTAGCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCTTCTGG 
TCAGTTTGGATTTTCATCTAATGGTGCTGGTGAAGAACTGGACTATACTG 
T T AAT GAAAGT AT AG AAAAG G G C T AT TAT T T AGT C C T AT CT T AT T C GAT C 
TTCGGTGGTATCTCTTTTGCTATTATTGAAAAAAGG 

SEQ ID NO. 7508 

STRAIN M781 

ATGTTAGTGGAATAGGAATTATTTCTTCTTTGGGAAAGAATTATAGC 
GAG CAT AAA C AG CAT C T C T T C G AC T T AAAAG AAGG AAT T T C T AAAC AT T T 
ATATAAAAATCACGACTCTATTTTAGAATCTTATACAGGAAGCATAACTA 
G T G AC C C AG AG GT T C C T GAG C AAT AC AAAG AT GAG AC ACGT AAT T T T AAA 

TTTGCTTTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTTAATTT 
AAAAG C T TAT CAT AAT AT TGCTGTGTGT T T AG GG AC C T C AC T T G GG GG AA 
AG AGT GC T GGT C AAAAT GC C T T GT AT C AAT T T GAAG AAG GAG AGC GT C AA 
G TAG AT G CT AGT T TAT T AG AAAAAG CAT C T GT T T ACC AT AT T G C T GAT G A 
ATTGATGGCTTATCATGATATTGTGGGAGCTTCGTATGTTATTTCAACCG 
CCTGTTCTG CAAGT AAT AAT G C C G T AAT AT T AGG AAC AC AAT T ACT T C AA 

GATGGCGATTGTGATTTAGCTATTTGTGGTGGCTGTGATGAGTTAAGTGA 
TATTTCTTTAGCAGGCTTCACATCACTAGGAGCTATTAATACAGAAATGG 
CAT GTCAGCCCTATTCTTCTGGAAAAGGAATCAATTT GGGT GAGGGCGCT 
GGT TTTGTTGTTCTTGTCAAAGATCAGTCCT TAG CT AAAT ATGGAAAAAT 
TATCGGTGGTCTTATTACTTCAGATGGTTATCATATAACAGCACCTAAGC 
C AAC AGGT GAAGGGGC G G C AC AG AT T G C AAAG C AGC T AGT G AC T C AAG C A 

GGTATTGACTACAGTGAGATTGACTATATTAATGGTCACGGTACAGGTAC 
TCAAGCTAATGATAAAATGGAAAAAAATATGTATGGTAAGTTTTTCCCGA 
C AACGAC AT T GAT C AG C AGT AC C AAG GG G C AAAC GGGT CAT AC T CT AGG G 
GC T G C AGGT AT TAT C GAAT T GAT T AAT T GT T T AG CGG C AAT AG AGG AAC A 
G AC T GT AC C AG C AAC T AAAAAT GAG AT T GG GAT AG AAGGT T T T C C AG AAA 
AT T T T GT C T AT CAT CAAAAG AG AG AAT AC C C AAT AAG AAAT GCT T T AAAT 
TTTTCGTTTGCTTTTGGTGGAAATAATAGTGGTATCTTATTGTCATCTTT 
AGAT T C AC C T C TAG AAAC AT T AC C T G C T AG AGAAAAT C T T AAAAT G G CT A 

TCTTATCATCTGTTGCTTCCATTTCTAAGAATGAATCACTTTCTATAACC 
TATGAAAAAGTTGCTAGTAATTTCAACGACTTTGAAGCATTACGCTTTAA 
AGG G G C TAG AC C AC C C AAAAC TGT C AAC C C AG C AC AAT T TAG G AAAAT GG 
AT GAT T T T T C C AAAAT GGTTGCCG T AAC AAC AG C T C AAG C ACT AAT AG AA 
AG C AAT AT T AAT CT AAAAAAAC AAG AT AC T T CAAAAG TAG GAAT T G T AT T 
TACAACACTTTCTGGACCAGTTGAGGTTGTTGAAGGTATTGAAAAGCAAA 
T C AC AAC AG AAGG AT AT G C AC AT GTTTCTGCTT C AC GAT T C C C GT T T AC A 
GT AAT GAAT G C AG C AG C T GG T AT GC T T T CT AT CAT T T T T AAAAT AAC AG G 

TCCTTTATCTGTCATTTCGACAAATAGTGGAGCGCTTGATGGTATACAAT 
AT GC C AAGG AAAT GAT G C GT AAC GAT AAT C TAG AC T AT G T GAT T C T T GT T 
TCTGCTAATCAGTGGACAGACATGAGTTTTATGTGGTGGCAACAATTAAA
CTATGATAGTCAAATGTTTGTCGGTTCTGATTATTGTTCAGCACAAGTCC 
TCTCTCGT CAAG C AT TGG AT AAT T CT C CT AT AAT AT T AGGT AGT AAAC AA 
T T AAAAT ATAGCC AT AAAACATT CAC AGATGT GAT GACTATTTT T GATGC 
TGCGCTTCAAAATTTATTATCAGACTTAGGACTAACCATAAAAGATATCA 
AAGG T TTCGTTTG G AAT GAG C GGAAGAAGGC AGT T AGT T CAG AT TAT GAT
TTCTTAGCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCTTCTGG 
TCAGTTTGGATTTTCATCTAATGGTGCTGGTGAAGAACTGGACTATACTG 
TTAATGAAAGTATAGAAAAGGGCTATTATTTAGTCCTATCTTATTCGATC 
TTTGGTGGTATCTCTTTTGCTATTATTGAAAAAAGG 

SEQ ID NO. 7509 

STRAIN CJB110 

ATGTTAGTGGAATAGGAATTATTTCTTCTTTGGGAAAGAATTATAGC 
GAGC ATAAACAGCAT CT CT T CGACT T AAAAGAAGG AATTT CT AAACAT T T 
ATATAAAAATCACGACTCTATTTTAGAATCTTATACAGGAAGCATAACTA 
GTGAC CCAGAGGTT C CT GAGC AAT AC AAAGAT GAG ACACGT AATT TT AAA 
TTTGCTTTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTTAATTT 
AAAAG C T T AT CAT AAT AT TGCTGTGT GT T T AG G G AC C T C ACT T G G GG G AA 
AGAGT GCT GGT CAAAAT GCCT TGT AT CAAT T TGAAGAAGGAGAGCGT CAA 
GTAGATGCTAGTTTATTAGAAAAAGCATCTGTTTACCATATTGCTGATGA 
ATTGATGGCTTATCATGATATTGTGGGAGCTTCGTATGTTATTTCAACCG 
CCTGTTCTGCAAGTAATAATGCCGTAATATTAGGAACACAATTACTTCAA 
GATGGCGATTGTGATTTAGCTATTTGTGGTGGCTGTGATGAGTTAAGTGA 
TAT T T C T T T AGC AGG C T T CAC AT C AC T AGG AG C T AT T AAT AC AG AAAT G G 
CATGTCAGCCCTATTCTTCTGGAAAAGGAATCAATTTGGGTGAGGGCGCT 
GGTTTTGTTGTTCTTGTCAAAGATCAGTCCTT AGC T AAAT ATGGAAAAAT 
TATCGGTGGTCTTATTACTTCAGATGGTTATCATATAACAGCACCTAAGC 
C AAC AGG T G AAG G GG C G G CAC AG AT T GC AAAG C AG C T AGT G AC T C AAGC A 
GGT AT T G AC T AC AGT GAG AT T G ACT AT AT T AAT G GT CAC G GT AC AG GT AC 
T CAAG C T AAT GAT AAAAT G G AAAAAAAT AT G T AT GGT AAGT T T T T C C C GA 
C AAC G AC AT T GAT C AGC AGT AC CAAG G GGC AAAC G G GT CAT AC T C TAG G G 
G C T G C AGGT AT TAT C G AAT T GAT T AAT T GT T T AG C G G C AAT AGAG G AAC A 
G AC T GT AC C AG C AAC T AAAAAT GAG AT T GGG AT AG AAG GT T T T C C AG AAA 
ATT TT GT CT AT CAT C AAAAG AGAG AAT AC C CAAT AAG AAAT GCT T T AAAT 
TTTTCGTTTGCTTTTGGTGGAAATAATAGTGGTATCTTATTGTCATCTTT 
AGATTCACCTCTAGAAACATTACCTGCTAGAGAAAATCTTAAAATGGCTA 
TCTTATCATCTGTTGCTTCCATTTCTAAGAATGAATCACTTTCTATAACC 
TAT G AAAAAG T T G C T AGT AAT T T C AACGAC T T T GAAGC AT T AC G CT T T AA 
AGGGGCTAGACCACCCAAAACTGTCAACCCAGCACAATTTAGGAAAATGG 
ATGATTTTTCCAAAATGGTTGCCGTAACAACAGCTCAAGCACTAATAGAA 
AG CAAT AT T AAT C T AAAAAAAC AAGAT AC T T C AAAAGT AGG AAT T GT AT T 
TACAACACTTTCTGGACCAGTTGAGGTTGTTGAAGGTATTGAAAAGCAAA 
T CAC AAC AG AAGG AT AT GC AC AT GT T T C T G CT T CAC GAT T C CC G T T T AC A 
GTAATGAATGCAGCAGCTGGTATGCTTTCTATCATTTTTAAAATAACAGG 
T C CT T T AT CT GT CAT T T CG AC AAAT AG T G GAG C G CT T GAT GGT AT AC AAT 
AT G C C AAGG AAAT GAT G CG T AAC GAT AAT CT AG AC TAT GT G AT T C T T GT T 
T C T G CT AAT C AGT GG AC AG AC AT G AGT T T T AT GT G GT GG C AAC AAT T AAA 
C T AT GAT AGT C AAAT GTTTGTCGGTTCT GAT TAT T GT T C AG CAC AAG T C C 
TCTCTCGTCAAGCATTGGATAATTCTCCTATAATATTAGGTAGTAAACAA 
T T AAAAT AT AG C C AT AAAAC AT T CAC AG AT GT G AT G ACT AT T T T T GAT G C 
T G CG C T T CAAAAT T TAT TAT C AGAC T T AG G ACT AAC CAT AAAAG AT AT C A 
AAGGT TTCGTTTG G AAT GAG C G G AAGAAG G C AGT T AGT T C AG AT TAT GAT 
TTCTTAGCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCTTCTGG 
T C AGT T T G GAT T T T CAT C T AAT GGT G C T G GT G AAG AAC T GG AC TAT ACT G 
TTAATGAAAGTATAGAAAAGGGCTATTATTTAGTCCTATCTTATTCGATC 
TTTGGTGGTATCTCTTTTGCTATTATTGAAAAAAGG 

SEQ ID NO. 7510 

STRAIN 1169NT 

AT GT T AGT GG AAT AGG AAT TAT TTCTTCTTTGG G AAAG AAT T AT AG 

CG AG CAT AAAC AGC AT CT C T T C G AC T T AAAAG AAGG AAT T T C T AAAC AT T 

TATATAAAAATCACGACTCTATTTTAGAATCTTATACAGGAAGCATAACT 

AGTGACCCAGAGGTTCCTGAGCAATACAAAGATGAGACACGTAATTTTAA 
ATTTGCTTTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTTAATT
TAAAAGCTTATCATAATATTGCTGTGTGTTTAGGGACCTCACTTGGGGGA 
AAGAGTGCTGGTCAAAATGCCTTGTATCAATTTGAAGAAGGAGAGCGTCA 
AGT AG AT G C T AGT T TAT T AG AAAAAG CAT CT GT T T AC CAT AT T G C T GAT G 
AAT TG AT GGC T TAT CAT GAT AT T GT G G GAG C T T CGT AT GT TAT T T C AAC C 
G C CT GT T CT GC AAGT AAT AAT G C C GT AAT AT T AGG AAC AC AAT T ACT T C A 
AG AT GG C GAT T GT GAT T T AG C TAT T TGT G GT GG CT GT GAT G AGT T AAGT G 
ATATTTCTTTAGCAGGCTTCACATCACTAGGAGCTATTAATACAGAAATG 
GCATGTCAGCCCTATTCTTCTGGAAAAGGAATCAATTTGGGTGAGGGCGC 
TGGTTTTGTTGTTCTTGTCAAAGATCAGTCCTTAGCTAAATATGGAAAAA 
T TAT CGGTGGTCT T AT T AC T T C AG AT G GT T AT CAT AT AAC AG C AC CT AAG 
CCAACAGGTGAAGGGGCGGCACAGATTGCAAAGCAGCTAGTGACTCAAGC 
AGG TAT T GAC TAC AG T GAG AT T G AC TAT AT T AACG GT C AC GGT AC AGGT A
CTCAAGCTAATGATAAAATGGAAAAAAATATGTATGGTAAGTTTTTCCCG 
AC AAC GAC AT T GAT C AG C AGT ACC AAGGG G C AAAC G G G T CAT AC T C T AGG 
GGCTGCAGGTATTATCGAATTGATTAATTGTTTAGCGGCAATAGAGGAAC 
AGACT GT AC C AG C AAC T AAAAAT GAGATT GGGAT AGAAGGTTT T CC AGAA 
AATTT TGT CT AT CAT CAAAAGAGAGAATACCCAATAAGAAATGCTTT AAA 
TTTTTCGTTTGCTTTTGGTGGAAATAATAGTGGTATCTTATTGTCATCTT 
TAGATTCACCTCTAGAAACATTACCTGCTAGAGAAAATCTTAAAATGGCT 
AT CT T AT CAT CTGTTGCTTC CAT T T C T AAG AAT G AAT C ACTT T CT AT AAC 
CTATGAAAAAGTTGCTAGTAATTTCAACGACTTTGAAGCATTACGCTTTA 
AAGG G G C TAG AC C AC C C AAAAC T GT C AAC C C AG C AC AAT T T AGG AAAAT G 
GAT GAT T T TT C C AAAAT GGT T G C C GT AAC AAC AG C T C AAG C ACT AAT AGA 
AAGCAATATTAAT CT AAAAAAACAAGAT ACTT CAAAAGTAGGAATT GTAT 
TTACAACACTTTCTGGACCAGTTGAGGTTGTTGAAGGTATTGAAAAGCAA 
AT C AC AAC AGAAGG AT AT G C AC AT GTTTCTGCTT C AC GAT T C C CGT T T AC 
AG T AAT GAAT GC AG C AG C T G G T AT G C T T T C TAT CAT T T T T AAAAT AAC AG 
GT C C T T TAT C T G T CAT T T C G AC AAAT AGT GG AG C G CT T GAT G G T AT AC AA 
TAT G C C AAGG AAAT GAT GC GT AAC GAT AAT CT AG AC TAT GT GAT T C TT GT 
TTCTGCTAATCAGTGGACAGACATGAGTTTTATGTGGTGGCAACAATTAA 
ACTATGATAGTCAAATGTTTGTCGGTTCTGATTATTGTTCAGCACAAGTC 
CTCTCTCGTCAAGCATTGGATAATTCTCCTATAATATTAGGTAGTAAACA 
AT T AAAAT AT AG C CAT AAAAC AT T C AC AGAT G T GAT GAC TAT T T T T GAT G 
CTGCGCT T CAAAAT TT ATT AT CAGACT T AGGACT AACCAT AAAAGAT AT C 
AAAG GTTTCGTTTG GAAT GAG C GG AAG AAGG C AGT TAG T T C AGAT TAT G A 
TTTCTTAGCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCTTCTG 
GTCAGTTTGGATTTTCATCTAATGGTGCTGGTGAAGAACTGGACTATACT 
GTTAATGAAAGTATAGAAAAGGGCTATTATTTAGTCCTATCTTATTCGAT 
CTTTGGTGGTATCTCTTTTGCTATTATTGAAAAAAGG 

SEQ ID NO. 7511 

STRAIN JM9130013 

ATGTTAGTGGAATAGGAATTATTTCTTCTTTGGGAAAGAATTATAGCGAG 
CAT AAAC AG CAT C T CT T CG AC T TAAAAG AAGG AAT T T C T AAAC AT T T AT A 
T AAAAAT CACGACTCTATTTTAGAATCTT AT ACAGGAAGCATAACTAGTG 
ACCCAGAGGTTCCTGAGCAATACAAAGATGAGACACGTAATTTTAAATTT
GCTTTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTTAATTTAAA 
AGCTTATCATAATATTGCTGTGTGTTTAGGGACCTCACTTGGGGGAAAGA 
GTGCTGGT CAAAATGCCTTGT AT C AATTT GAAGAAGG AG AGCGT C AAGT A 
GAT G C TAG T T TAT TAG AAAAAG CAT C T GT T T AC CAT AT T G CT GAT GAAT T 
GATGGCTTATCATGATATTGTGGGAGCTTCGTATGTTATTTCAACCGCCT
GTT CT GCAAGT AATAAT GC CGT AAT ATT AGGAACAC AATT ACT T CAAGAT 
GGCGATTGTGATTTAGCTATTTGTGGTGGCTGTGATGAGTTAAGTGATAT 
T T CT T T AG C AGG CT T C AC AT C AC T AGG AG CT AT T AAT AC AG AAAT G G CAT 
GTCAGCCCTATTCTTCTGGAAAAGGAATCAATTTGGGTGAGGGCGCTGGT 
T T T GT T GTT CT TGT C AAAGAT CAGT CCTT AGCT AAAT AT GGAAAAATT AT 
CGGTGGTCTTATTACTTCAGATGGTTATCATATAACAGCACCTAAGCCAA 
CAGGTGAAGGGGCGGCACAGATTGCAAAGCAGCTAGTGACTCAAGCAGGT 
ATTGACTACAGTGAGATTGACTATATTAACGGTCACGGTACAGGTACTCA 
AGCTAATGATAAAATGGAAAAAAATATGTATGGTAAGTTTTTCCCGACAA 
CG AC AT T GAT C AG CAGT AC C AAG G G G C AAAC GG G T CAT AC T C TAG G GG C T 
GCAGGTATTATCGAATTGATTAATTGTTTAGCGGCAATAGAGGAACAGAC 
T GT AC C AG C AAC T AAAAAT GAG AT T G GG AT AG AAG GT T T T C C AG AAAAT T 
T T GT CT AT CAT C AAAAG AG AG AAT AC C C AAT AAG AAAT GC T T T AAAT TT T
TCGTTTGCTTTTGGTGGAAATAATAGTGGTGTCTTATTGTCATCTTTAGA 
TTCACCTCTAGAAACATTACCTGCTAGAGAAAATCTTAAAATGGCTATCT 
TATCATCTGTTGCTTCCATTTCTAAGAATGAATCACTTTCTATAACCTAT 
G AAAAAG T T G C T AGT AAT TT C AAC GAC T T T GAAG CAT T AC GCT T T AAAG G 
GG CT AGACCAC C C AAAAC T G T C AAC C C AG C AC AAT T T AGG AAAAT G GAT G 
ATTTTTCCAAAATGGTTGCCGTAACAACAGCTCAAGCACTAATAGAAAGC 
AATATTAATCTAAAAAAACAAGATACTTCAAAAGTAGGAATTGTATTTAC 
AAC AC T TTC T G G ACC AGT T G AG GT T GT T GAAG G T AT T G AAAAG C AAAT C A 
C AAC AG AAG GAT AT G C AC AT GTTTCTGCTT C AC GAT T C C CGT T TAC AGT A
ATGAATGCAGCAGCTGGTATGCTTTCT AT CAT TTTT AAAAT AAC AGGTCC 
TTTATCTGTCATTTCGACAAATAGTGGAGCGCTTGATGGTATACAATATG 
C C AAGG AAAT GAT GCGT AAC GAT AAT CT AG AC TAT GT G ATT CT T GT T T CT 
GCT AAT C AGT G GAC AG AC AT G AGT T T T AT GT G G T GGC AAC AAT T AAAC T A 
T GAT AGT C AAAT GTTTGTCGGTTCT GAT TAT T GT T C AG C AC AAGT C CT CT 
CTCGTCAAGCATTGGATAATTCTCCTATAATATTAGGTAGTAAACAATTA 
AAATATAGCCATAAAACATTCACAGATGTGATGACTATTTTTGATGCTGC 
GCTTCAAAATTTATTATCAGACTTAGGACTAACCATAAAAGATATCAAAG 
GT T T C GT T T GG AAT GAGC G G AAGAAGG C AGT T AGT T C AG AT TAT GAT TTC 
TTAGCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCTTCTGGTCA 
GTTTGGATTTTCATCTAATGGTGCTGGTGAAGAACTGGACTATACTGTTA 
AT G AAAGT AT AG AAAAG GG C TAT TAT T T AGT C C T AT C T TAT T C GAT CT T C 
GGTGGTATCTCTTTTGCTATTATTGAAAAAAGG 

SEQ ID NO. 7512

STRAIN 2603 frame: 1

MSVYVSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQ 
YKDETRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQV 
DASLLEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGG
CDELSDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGL 
ITSDGYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKF 
FPTTTLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKR 
EYPIRNALNFSFAFGGNNSGVLLSSLDSPLETLPARENLKMAILSSVASISKNESLSITY 
EKVASNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTS
KVGIVFTTLSGPVEVVEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSV 
ISTNSGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSA 
QVLSRQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNER 
KKAVSSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIF 
GGISFAIIEKR 

SEQ ID NO. 7513 

STRAIN 090 frame: 3 

VSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQYKDE 
TRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQVDASL 
LEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGGCDEL
SDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGLITSD
GYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKFFPTT 
TLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKREYPI 
RNALNFSFAFGGNNSGILLSSLDSPLETLPARENLKMAILSSVASISKNESLSITYEKVA 
SNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTSKVGI 
VFTTLSGPVEVVEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSVISTN 
SGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSAQVLS 
RQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNERKKAV 
SSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIFGGIS
FAIIEKR 

SEQ ID NO. 7514 

STRAIN A909 frame: 3

VSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQYKDE 
TRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQVDASL 
LEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGGCDEL
SDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGLITSD
GYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKFFPTT
TLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKREYPI 
RNALNFSFAFGGNNSGVLLSSLDSPLETLPARENLKMAILSSVASISKNESLSITYEKVA
SNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTSKVGI 
VFTTLSGPVEVVEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSVISTN
SGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSAQVLS
RQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNERKKAV
SSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIFGGIS
FAIIEKR 

SEQ ID NO. 7515 

STRAIN H36B frame: 3

VSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQYKDE 
TRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQVDASL 
LEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGGCDEL 
SDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGLITSD
GYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKFFPTT 
TLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKREYPI 
RNALNFSFAFGGNNSGVLLSSLDSPLETLPARENLKMAILSSVASISKNESLSITYEKVA 
SNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTSKVGI
VFTTLSGPVEVVEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSVISTN
SGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSAQVLS 
RQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNERKKAV 
SSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIFGGIS
FAIIEKR 

SEQ ID NO. 7516 

STRAIN 18RS21 frame: 3 

VSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQYKDE 
TRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQVDASL 
LEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGGCDEL 
SDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGLITSD
GYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKFFPTT 
TLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKREYPI 
RNALNFSFAFGGNNSGVLLSSLDSPLETLPARENLKMAILSSVASISKNESLSITYEKVA
SNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTSKVGI
VFTTLSGPVEVVEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSVISTN
SGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSAQVLS 
RQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNERKKAV 
SSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIFGGIS
FAIIEKR 

SEQ ID NO. 7517 

STRAIN M732 frame: 3 

VSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQYKDE 
TRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQVDASL 
LEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGGCDEL
SDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGLITSD
GYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKFFPTT 
TLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKREYPI 
RNALNFSFAFGGNNSGVLLSSLDSPLETLPARENLKMAILSSVASISKNESLSITYEKVA
SNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTSKVGI
VFTTLSGPVEVVEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSVISTN
SGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSAQVLS
RQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNERKKAV
SSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIFGGIS 
FAIIEKR 

SEQ ID NO. 7518 

STRAIN COH1 frame: 3 

VSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQYKDE 
TRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQVDASL 
LEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGGCDEL
SDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGLITSD
GYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKFFPTT 
TLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKREYPI 
RNALNFSFAFGGNNSGVLLSSLDSPLETLPARENLKMAILSSVASISKNESLSITYEKVA
SNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTSKVGI 
VFTTLSGPVEVVEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSVISTN 
SGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSAQVLS 
RQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNERKKAV 
SSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIFGGIS 
FAIIEKR 

SEQ ID NO. 7519 

STRAIN M781 frame: 3

VSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQYKDE 
TRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQVDASL 
LEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGGCDEL 
SDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGLITSD 
GYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKFFPTT 
TLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKREYPI 
RNALN F S FAFGGNN SGILLSSLDS PLET L PARENLKMAI L S S VAS I S KNE S LS I T YEKVA 
SNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTSKVGI 
VFTTLSGPVEVVEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSVISTN
SGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSAQVLS 
RQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNERKKAV 
SSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIFGGIS
FAIIEKR 

SEQ ID NO. 7520 

STRAIN CJB110 frame: 3 

VSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQYKDE 
TRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQVDASL 
LEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGGCDEL 
SDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGLITSD
GYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKFFPTT 
TLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKREYPI 
RNALNFSFAFGGNNSGILLSSLDSPLETLPARENLKMAILSSVASISKNESLSITYEKVA
SNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTSKVGI 
VFTTLSGPVEVVEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSVISTN
SGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSAQVLS 
RQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNERKKAV 
SSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIFGGIS
FAIIEKR 

SEQ ID NO. 7521 

STRAIN 1169NT frame: 3 

VSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQYKDE
TRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQVDASL 
LEKASVYHIADELMAYHDIVGASYVISTAGSASNNAVILGTQLLQDGDCDLAICGGCDEL
SDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGLITSD
GYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKFFPTT 
TLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKREYPI
RNALNFSFAFGGNNSGILLSSLDSPLETLPARENLKMAILSSVASISKNESLSITYEKVA
SNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTSKVGI 
VFTTLSGPVEVVEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSVISTN
SGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSAQVLS 
RQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNERKKAV 
SSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIFGGIS 
FAIIEKR 

SEQ ID NO. 7522 

STRAIN JM9130013 frame: 3 

VSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQYKDE
TRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQVDASL 
LEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGGCDEL
SDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGLITSD 
GYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKFFPTT 
TLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKREYPI 
RNALNFSFAFGGNNSGVLLSSLDSPLETLPARENLKMAILSSVASISKNESLSITYEKVA
SNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTSKVGI 
VFTTLSGPVEVVEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSVISTN
SGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSAQVLS 
RQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNERKKAV
SSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIFGGIS
FAIIEKR 

SEQ ID NO. 7601 
STRAIN 2603 

AT G AAAAAAGT CAT C GAT T T AAAAAAACT AC AAAAAG CAT AT G C CT C AG AAAC C GT T T T A 
AAT AAT AT T AAT T T G GAG G T GT T T AAAGG CGAAAT AAT T G G AT T AAT AG G AC C C T C T GGA 
G C AGGGAAAT C T AC C T T GAT T AAAACT AT GC T T G G CAT GGAAAAAG C AG AT AAGGG AAC A 
GCTCTTGTTCTTGATACTCAAATGCCAGATCGTAATATTTTAAATCAAATTGGCTATATG 
GCTCAATCTGATGCCTTATACGAGTCTTTAACTGGCTTAGAAAATTTATTATTCTTTGGA 
AAAATGAAAGGT ATT CAAAAAACTGAAT TAAAACAGCAGAT AACT CATAT TT CT AAAGT A 
GT AG AT C T AG AAAAC C AAC T T GAT AAAT TT GT CT CAGGT TACT CAGG AGGTATGAAAAGA 
CGGCTTTCTCTAGCCATCGCCCTACTTGGAAACCCCACAGTTTTAATCCTAGATGAACCT 
AC C GT T G GAAT T GAT C CAT C C T T GAG G AGAAAAAT CT G G C AAGAG CT AAT T AAT AT T AAG 
GAT GAAGGAC ATT CT AT CTT T ATTACAACCCACGTTAT GGAT GAAGC AGAATT AACAAGT 
AAGGTTGCACTACTATTACGTGGAAACATTATTGCCTTTGATACTCCATTACATTTAAAA 
AAAC AAT T T AAT GT GAG TACT ATT GAGGAAGTT T T CT T AAAAG CT GAAG G AG AA 

SEQ ID NO. 7602 

STRAIN 090 

AT T TAAAAAAAC T AC AAAAAG CAT AT G C CT C AG AAACT GT T T T AAAT AAT 
AT T AAT T T G GAG G T G T T T AAAGG C G AAAT AAT T GG AT T AAT AGG AC C CT C 
TGGAGCAGGGAAATCTACCTTGATTAAAACTATGCTTGGC AT GGAAAAAG 
C AG AT AAG GG AAC AG CTCTTGTTCT TG AT ACT C AAAT G C C AG AT C G T AAT 
AT T T T AAAT C AAAT T GGCT AT AT GG CT C AAT C T GAT G C CT T AT AC GAAT C 
TTTAACTGCCTTAGAAaATTTATTATTCTTTGGAAAAATGAAAGGTATTC 
AAAAAACT G AAT TAAAACAGCAGAT AAC T CATAT T T c T AAAGT AGTAGAT 
CT AGAAAACCAACTT GAT AAAT TT GTCT C AGGTTACT C AGGAGGT ATGAA 
AAGACGGCTTTCTCTAGCCATCGCCCTACTTGGAAACCCCACAGTTTTAA 
T C C TAG AT G AAC CT AC C GT T G GAAT T GAT C CAT CC T T GAG G AG AAAAAT C 
T GG C AAGAG CT AAT T AAT AT T Aa G GAT G AAGG AC GT T CT AT CTT TAT T AC 
AACCCACGTTATGGATGAAGCAGAATTAACAAGTAAGGTTGCACTACTAT 
T ACGT GG AAAC AT TAT T GC CT T T GAT ACT C CAT T AC AT T TAAAAAAAC AA 
TTTAATGTGAGTACTATtGAGGAAGTTTTCTTAAAAGCTGAAGGAGAA 

SEQ ID NO. 7603

STRAIN A909

AAAAAAGT CAT C GAT T TAAAAAAAC T AC AAAAAG C AT AT GCCT C A 
GAAACCGTTTT AAAT AAT ATT AAT TTGGAGGTGTTTAAAGGCGAAAT AAT 
T G GAT T AAT AGG AC C C T C T G G AGC AG GG AAAT CT AC CTT GAT T AAAAC T A 
T G C T T GG CAT GGAAAAAG C AG AT AAG GG AAC AG CT CTTGTTCTT GAT ACT 
CAAATGCCAGAT CAT AAT AT T T T AAAT CAAAT T GG CT AT AT GGCT CAAT C 
TGATGCCTTATACGAGT CTT T AACT GGCT TAGAAAATTT ATT ATT CTTTG 
GAAAAATGAAAGGTATTCAAAAAACTGAATTAAAACAGCAGATAACTCAT 
AT T T C T AAAGT AGTAGAT C T AGAAAACC AAC T T GAT AAAT T T GT C T CAGG 
TTACTCAGGAGGTATGAAAAGACGGCTTTCTCTAGCCATCGCCCTACTTG 
GAAAC C C C AC AGT T T T AAT C C TAG AT G AAC C T ACC GT T G GAAT T GAT C C A 
T C C T T GAG GAG AAAAAT C T G G C AAGAG CT AAT T AAT AT T AAG GAT GAAG G 
ACGT T C TAT CTT TAT T AC AAC C C AC GT TAT GGAT GAAG C AG AAT T AAC AA 
GTAAGGTTGCACTACTATTACGTGGAAACATTATTGCCTTTGATACTCCA 
TTACATTTAAAAAAACAATTTAATGTGAGTACTATTGAGGAAGTTTTCTT 
AAAAGCTGAAGGAGAA 

SEQ ID NO. 7604 

STRAIN H36B

AAAAAAGTCATTGATTTAAAAAAACTACAAAAAGCATATGCC
TCAGAAACCGTTTTAAATAATATTAATTTGGAGGTGTTTAAAGGCGAAAT
AATTGGATTAATAGGACCCTCTGGAGCAGGGAAATCTACCTTGATTAAAA
CTATGCTTGGCATGGAAAAAGCAGATAAGGGAaCAGCTCTTGTTCTTGAT
ACTCAAATGCCAGATCGTAATATTTTAAATCAAATTGGCTATATGGCTCA
ATCTGATGCCTTATACGAGTCTTTAACTGGCTTAGAAAATTTATTATTCT 
TTGGAAAAATGAAAGGTATTCAAAAAACTGAATTAAAACAGCAGATAACT 
CAT AT T T CT AAAGT AGT AG AT C T AG AAAAC C AAC T T GAT AAAT T T GT CT C 
AG GT T AC T C AG GAG GT AT G AAAAG ACGGC T T T C T C TAG C CAT CG C CC T AC 
T T G G AAAC C C C AC AGT T T T AAT C C TAG AT GAAC C T AC C GT T G G AAT T GAT 
C CAT C C T T GAG GAGAAAAAT C T GG C AAG AGC T AAT T AAT AT T AAGG AT GA 
AGGACGTTCTATCTTTATTACAACCCACGTTATGGATGAAGCAGAATTAA 
CAAGTAAGGTTGCACTACTATTACGTGGAAACATTATTGCCTTTGATACT 

CCATTACATTTAAAAAAACAATTTAATGTGAGTACTATTGAGGAAGTTTT
CTTAAAAGCTGAAGGAGAA

SEQ ID NO. 7605 

STRAIN 18RS21 

GAT T T AAAAAAAC T AC AAAAAG C AT AT GC C T C AG AAAC C GT T T T AAAT AA 
TATTAATTTGGAGGTGTTTAAAGGCGAAATAATTGGATTAATAGGACCCT 
CT GGAG C AG G GAAAT C T AC c T T GAT T AAAAC TAT G C T T G G CAT GG AAAAA 
G C AG AT AAGGGAAC AGC T CT T G T T C T T GAT AC T C AAAT G C C AGAT C GT AA 
T ATT T T AAAT C AAAT T G G C TAT AT G G CT C AAT c T GAT G C C T T AT ACG AGT 
CTTTAACTGGCTTAGAAAATTTATTATTCTTTGGAAAAATGAAAGGTATT 
CAAAAAACTGAATTAAAACAGCAGATAACTCATATTTCTAAAGTAGTAGA 
TCTAGAAAACCAACTTGATAAATTTGTCTCAGGTTACTCAGGAGGTATGA 
AAAGACGGCTTTCTcTAGCCATCGCCCTACTTGGAAACCCCACAGTTTTA 
AT C C TAG AT GAAC C T AC CGT T GGAAT T GAT C CAT C CT T G AGG AG AAAAAT 
C T G G C AAGAG CT AAT T AAT AT T Aa GG AT GAAG G AC AT T C T AT CT T T AT T A 
C AAC C C ACG T TAT GG AT GAAG C AGAAT T AAC AAGT AAG G T T G C AC T AC T A 
TTACGTGGAAACATTATTGCCTTTGATACTCCATTACATTTAAAAAAACA 
AT T T AAT GT G AGT ACT AT T GAG G AAGT T T T C T T AAAAG C T GAAGGAGAA 

SEQ ID NO. 7606 

STRAIN M732 

AAAAAAGT CAT C GAT T T AAAAAAAC T AC AAAAAG CAT AC G C CT C A 

G AAAC T G T T T T AAAT AAT AT T AAT T T G GAG G T GT T T AAAG G AGAAAT AAT 

T G GAT T AAT AGG AC C CT C T GG AG C AG GG AAAT CT AC C T T G AT T AAAAC T A 

TGCTTGGCATGGAAAAAGCAGAT AAGGGAAC AGCTCTTGTTCTTGATACT 
C AAAT G C C AG AT C GT AAT AT T T T AAAT C AAAT T GG C TAT AT GG C T C AAT C 

TGATGCCTTACACGAGTCTTTAACTGGCTTAGAAAATTTATTATTCTTTG 
GAAAAAT G AAAG GT AT T CAAAAAACT G AAT T AAAAC AG C AG AT AAC T CAT 
AT T T C T AAAGT AGT AG AT C TAG AAAAC C AAC T T GAT AAAT T T GT C T C AG G 
TTACTCAGGAGGTATGAAAAGACGGCTTTCTCTAGCCATCGCCCTACTTG 
G AAAC CC C AC AGT T T T AAT C C TAG AT GAAC C TAG CGT T GGAAT T GAT C C A 
T C C T T G AGG AG AAAAAT C T G GC AAG AGC T AAT T AAT AT T AAG GAT GAAG G 
AC GT T CT AT C T T TAT TAG AAC C C AC GT TAT G GAT G AAG C AG AAT T AAC AA 

GTAAGGTTGCACTACTATTACGTGGAAACATTATTGCCTTTGATACTCCA 
T T AC AT T T AAAAAAAC AAT T T AAT G T GAG TACT AT T G AGG AAG T T T T C T T 
AAAAG C T GAAG G AG AA 

SEQ ID NO. 7607 

STRAIN COH1 

AAAAAAGT CAT CGATTT AAAAAAACTACAAAAAGC AT ACGCCTCAGAA 
AC T GT T T T AAAT AAT AT T AAT T T G G AGG T G T T T AAAGG AG AAAT AAT T G G 
AT T AAT AG G AC C C T C T GG AGC AG GG AAAT CT AC CT TGAT T AAAAC TAT G C 
T T G G CAT GG AAAAAG C AGAT AAG GG AAC AG CT CT T GT T C T T GAT AC T C AA 
AT G C C AG AT C GT AAT AT T T T AAAT C AAAT T GG C TAT AT G G C T C AAT CT G A 

TGCCTTACACGAGTCTTTAACTGGCTTAGAAAATTTATTATTCTTTGGAA 
AAAT G AAAGG TAT T C AAAAAAC T G AAT T AAAAC AG C AG AT AAC T CAT AT T 
T C T AAAGT AGT AG AT CT AG AAAAC C AAC T T GAT AAAT T T GT C T C AGGT T A 
C T C AG G AG GT AT G AAAAG AC GG CTT T CT C T AG C CAT C GC C C T ACT T G G AA 
ACC C CAC AGT TTT AAT CCTAGATGAACCTACCGTTGG AAT T G AT CC AT CC 
T T G AGG AG AAAAAT C T GG C AAG AG C T AAT T AAT AT T AAG GAT GAAG G AC G 
T T C T AT CT T TAT T AC AAC C C AC GT TAT GG AT GAAG C AG AAT T AAC AAG T A 

AGGTTGCACTACTATTACGTGGAAACATTATTGCCTTTGATACTCCATTA 
CATTTAAAAAAACAATTTAATGTGAGTACTATTGAGGAAG

SEQ ID NO. 7608

STRAIN M781 

AAAAAAGT CAT C GAT T T AAAAAAAC T AC AAAAAG CAT AC 

GCCTCAGAAACTGTTTTAAATAATATTAATTTGGAGGTGTTTAAAGGAGA 
AAT AAT T GG AT T AAT AGGAC C C T C T G GAGC AG G G AAAT CT AC C T T GAT T A 
AAAC TAT G C T T GGC AT G G AAAAAGC AG AT AAGGGAAC AG CTCTTGTTCTT 
GAT ACT C AAAT G C C AGAT C GT AAT AT T TT AAAT C AAAT T G G C TAT AT G GC 
TCAATCTGATGCCTTACACGAGTCTTTAACTGGCTTAGAAAATTTATTAT 
T C T T T G GAAAAAT GAAAG GT AT T C AAAAAACT GAATT AAAAC AG C AG AT A 
ACTCATATTTCTAAAGTAGTAGATCTAGAAAACCAACTTGATAAATTTGT 
C T C AGGT T AC T C AG G AGGT AT G AAAAG ACGGCT T T C T C T AG C CAT C G C CC 
TACT T GG AAAC C C C AC AGT T T T AAT C C TAG AT G AAC C T AC CGT T G G AAT T 
GAT C CAT C C T T GAGGAGAAAAAT CT GG C AAGAGC T AAT T AAT AT T AAGG A 
T G AAGG AC GT T CT AT C T T T AT TAG AAC C C AC GT TAT GG AT G AAG C AG AAT 
TAACAAGTAAGGTTGCACTACTATTACGTGGAAACATTATTGCCTTTGAT 
AC T C CAT T AC AT T T AAAAAAAC AAT T T AAT G T G AGT ACT AT T G AGG AAGT 
T T T C T T AAAAG C T G AAGG AG AA 

SEQ ID NO. 7609 

STRAIN CJB110 

AAAAAAGT C AT CG AT TT AAAAAAAC T AC AAAAAG CAT AT G 
CCTCAGAAACTGTTTTAAATAATATTAATTTGGAGGTGTTTAAAGGCGAA 
AT AAT T GGAT T AAT AGG AC C C T C T GGAG C AGG GAAAT C T AC C T T G AT T AA 
AAC TAT G C T T GGC AT GG AAAAAG C AG AT AAGG G AAC AGC T CTTGTTCTTG 
AT AC T C AAAT GC C AGAT CGT AAT AT T T T AAAT C AAAT T G GC T AT AT G G C T 
CAATCTGATGCCTTATACGAATCTTTAACTGCCTTAGAAAATTTATTATT 
C T T T G GAAAAAT G AAAG GT AT T C AAAAAACTG AAT T AAAAC AGC AGAT AA 
C T CAT AT T T CT AAAG T AG T AGAT C T AGAAAAC C AACT T GAT AAAT T T G T C 

TCAGGTTACTCAGGAGGTATGAAAAGACGGCTTTCTCTAGCCATCGCCCT 
ACT T GGAAAC C C C AC AGT TT T AAT CCT AGAT GAAC C T AC C GT T GG AAT T G 
AT C CAT C CT T GAG G AG AAAAAT C T GG C AAG AG C T AAT T AAT AT T AAG GAT 
G AAGG AC GT T C T AT CT T TAT TAG AAC C C ACG T TAT GGAT G AAG C AG AAT T 
AAC AAG T AAG GT T G C AC T AC T AT T AC GT GGAAAC AT TAT T G C C T T T GAT A 
C T C CAT T AC AT T T AAAAAAAC AAT T T AAT GT G AG T AC T AT T GAGG AAGT T 
T T CT T AAAAG C T G AAGGAG AA 

SEQ ID NO. 7610 

STRAIN 1169NT 

AAAAAAGT C AT CG AT T T AAAAAAACT AC AAAAAGC AT AC 
G C C T C AGAAACT GTT T T AAAT AAT AT T AAT T T G G AGGT GT T T AAAG G C GA 
AAT AAT T G GAT T AAT AGG AC C CT CT G GAG C AG GG AAAT C T AC C T T GAT T A 
AAAC TAT G CT T G G CAT GG AAAAAG C AG AT AAGGGAAC AG CTCTTGTTCTT 
GAT ACT C AAAT G C C AG AT C GT AAT AT T T T AAAT C AAAT T GG CT AT AT GG C 
TCAATCTGATGCCTTATACGAATCTTTAACTGCCTTAGAAAATTTATTAT 
T CT TTGGAAAAATGAAAGGT ATT CAAAAAACT GAATT AAAACAGCAGAT A 
ACT CAT AT T T C T AAAG T AGT AG AT CT AGAAAAC C AAC T T GAT AAAT T T GT 
C T C AG GT T AC T C AGG AG GT AT G AAAAGAC GGCTTTCT CT AG C CAT CG C C C 
T AC T T GGAAAC C C C AC AGT T T T AAT C CT AG AT GAAC CT AC C GT T G G AAT T 
GAT C C AT C C T T GAGGAGAAAAAT C T GG C AAG AG C T AAT T AAT AT T AAGG A 
T GAAG G AC GT T C TAT C T T TAT T AC AAC C C ACGT TAT G GAT G AAG C AG AAT 
T AAC AAGT AAG G T T G C ACT AC TAT T AC GT G G AAAC AT TAT T G C CT T T GAT 

ACT CC ATT AC ATTT AAAAAAAC AAT TT AATGTG AGT ACT AT TGAGGAAGT 
TTTCTTAAAAGCTGAAGGAGAA 

SEQ ID NO. 7611 

STRAIN JM9130013 

AAAAAAGT CAT CGATTT AAAAAAACT ACAAAAAGC AT ATGCC 

T C AGAAACCGTTTT AAAT AAT AT TAATTTGG AGGT GTTTAAAGGCG AAAT 

AATTGGATTAATAGGACCCTCTGGAGCAGGGAAATCTACCTTGATTAAAA 

CT AT G C T T GG C AT G G AAAAAG C AG AT AAGG GAAC AG CTCTTGTTCTT GAT 

ACTCAAATGCCAGATCGTAATATTTTAAATCAAATTGGCTATATGGCTCA 
ATCTGATGCCTTATACGAGTCTTTAACTGGCTTAGAAAATTTATTATTCT 
TTGGAAAAATGAAAGGTATTCAAAAAACTGAATTAAAACAGCAGATAACT 
CAT ATTT CTAAAGT AGT AG AT CT AGAAAAC CAACTTG AT AAATTTGTCTC 
AGGTTACTCAGGAGGTATGAAAAGACGGCTTTCTCTAGCCATCGCCCTAC
T T GGAAACC CCACAGTT T TAAT CCT AGAT GAACCT ACCGTT GGAATT GAT 
C CAT C CT T GAG G AG AAAAAT C T GG C AAG AGCT AAT T AAT AT T AAGG AT GA 
AGGACGT T CT AT CT T TAT T AC AAC C C AC GT T AT GG AT G AAG C AG AAT TAA 
CAAGTAAGGTTGCACTACTATTACGTGGAAACATTATTGCCTTTGATACT 
C CAT TAG AT T T AAAAAAAC AAT T T AAT GT GAGT ACT AT T GAGGAAGT T T T 
CTTAAAAGCTGAAGGAGAA 

SEQ ID NO. 7612 
STRAIN 2603 frame: 1 

KKVIDLKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTA 
LVLDTQMPDRNILNQIGYMAQSDALYESLTGLENLLFFGKMKGIQKTELKQQITHISKVV
DLENQLDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKD
EGHSIFITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV

SEQ ID NO. 7613 

STRAIN 090 frame: 3 

LKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTALVLDT 
QMPDRNILNQIGYMAQSDALYESLTALENLLFFGKMKGIQKTELKQQITHISKVVDLENQ
LDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKDEGRSI 
FITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV 

SEQ ID NO. 7614 

STRAIN A909 frame: 1 

KKVIDLKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTA
LVLDTQMPDHNILNQIGYMAQSDALYESLTGLENLLFFGKMKGIQKTELKQQITHISKVV
DLENQLDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKD
EGRSIFITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV

SEQ ID NO. 7615 

STRAIN H36B frame: 1

KKVIDLKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTA 
LVLDTQMPDRNILNQIGYMAQSDALYESLTGLENLLFFGKMKGIQKTELKQQITHISKVV
DLENQLDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKD
EGRSIFITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV

SEQ ID NO. 7616 

STRAIN 18RS21 frame: 1 

DLKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTALVLD 
TQMPDRNILNQIGYMAQSDALYESLTGLENLLFFGKMKGIQKTELKQQITHISKVVDLEN
QLDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKDEGHS
IFITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV

SEQ ID NO. 7617 

STRAIN M732 frame: 1

KKVIDLKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTA 
LVLDTQMPDRNILNQIGYMAQSDALHESLTGLENLLFFGKMKGIQKTELKQQITHISKVV
DLENQLDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKD
EGRSIFITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV

SEQ ID NO. 7618 

STRAIN COH1 frame: 1 

KKVIDLKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTA 
LVLDTQMPDRNILNQIGYMAQSDALHESLTGLENLLFFGKMKGIQKTELKQQITHISKVV
DLENQLDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKD
EGRSIFITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV

SEQ ID NO. 7619 

STRAIN M781 frame: 1

KKVIDLKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTA 
LVLDTQMPDRNILNQIGYMAQSDALHESLTGLENLLFFGKMKGIQKTELKQQITHISKVV
DLENQLDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKD
EGRSIFITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV

SEQ ID NO. 7620

STRAIN CJB110 frame: 1 

KKVIDLKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTA
LVLDTQMPDRNILNQIGYMAQSDALYESLTALENLLFFGKMKGIQKTELKQQITHISKVV
DLENQLDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKD 
EGRSIFITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV 

SEQ ID NO. 7621 

STRAIN 1169NT frame: 1 

KKVIDLKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTA
LVLDTQMPDRNILNQIGYMAQSDALYESLTALENLLFFGKMKGIQKTELKQQITHISKVV 
DLENQLDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKD 
EGRSIFITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV 

SEQ ID NO. 7622 

STRAIN JM9130013 frame: 1 

KKVIDLKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTA 
LVLDTQMPDRNILNQIGYMAQSDALYESLTGLENLLFFGKMKGIQKTELKQQITHISKVV
DLENQLDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKD 
EGRSIFITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV 

SEQ ID NO. 7701 
STRAIN 2603 

TTGCCTATGTTGTCTGTTGGTTTAGTTTTAGAGGGTGGCGGAATGAGAGGTCTTTATACT 
GCTGGAGTTTTAGATGCTTTTCTAGATGCAGGAATAAAAATAGATGGTATCGTATCTGTC 
TCTGCTGGTGCATTGTTTGGTGTTAATTTTGTATCTAGACAACGAGAGAGGGCTTTGCGA 
T AC AAT AAAAAG TAT T T AT C C C AC C C T AAAT AT AT G AGT C T AAGGT C AT GG T T T C G AAC A 
G GGAAT T T T GT T AAT AAAG AT T T C AC CT AT TAT G AAGT T C CT AT GAAAT T GG AT GT AT T T 
GACGATGAAGCATTTAAAAAAT C AAGT ATT GATTT TT ACGTAGT T G CT AC AGAGAT GAC A 
TCTGGTAAACCTGAATATTTTAAAATTGATAGTGTTTTTGAACAAATGGAAATTTTACGT 
G CT AGT T C AGC AT T AC C AGT AG T C T C AAAG AT G GT T GAT T GG C AGGG GAAAAAGT ACT T A 
GATGGTGGTTTATCTGATAGTATTCCCGTTGATTTTGCCCGTGGTTTAGGATTTGACAAG 
TTGATTGTTGTGATGACTAGGCCGCTCAATTATCAGAAAAAGCCTTCAAGTGGACGATTG 
T AT AAAAC T CT GT AT AG GAAAT AT C CT AAT T T T GT AAAG AC AG C C T C G AAT CG GT AC C AA 
CAGTATAATAATAGTCTTGAAAAGGTCATGAGCCTTGAAAAAACAGGCGATCTATTTGCA 
AT TAG AC CG AG T AAG AG CT T GG T TAT TGGCCGCT T AGAG AAG AAT C C G G AT AAAC T T GAT 
AGT AT T TAT C AG CT T GGT AT G AAAGAT GC T AAAAG T GT G AT G C C T GAG CT G AAT AGT TAT 
CTAATGAAA 

SEQ ID NO. 7702 

STRAIN 090 

CCTATGTTGTCTGTTGGTTTAGTTTTAG 

AGGGTGGCGGAATGAGAGGTCTTTATACTGCTGGAGTTTTAGATGCTTTT 
CTAGATGCAGGAATAAAAATAGATGGTATCGTATCTGTCTCTGCTGGTGC 
ATTGTTTGGTGTTAATTTTGTATCTAGACAACGAGAGAGGGCTTTGCGAT 
ACAATAAAAAGTATTTATCCCACCCTAAATATATGAGTCTAAGGTCATGG 
T T T C G AAC AG G G AAT T T T GT T AAT AAAGAT T T C AC CT AT T AT G AAG T T C C 
TAT GAAAT T G GAT G TAT T T GACG AT G AAG CAT T T AAAAAAT C AAG TAT T G 
AT T T T T ACG T AG T T G CT AC AG AG AT GAC AT C T GG T AAAC C T G AAT AT T T T 
AAAATTGATAGTGTTTTTGAACAAATGGAAATTTTACGTGCTAGTTCAGC 
ATTACCAGTAGTCTCAAAGATGGTTGATTGGCAGGGGAAAAAGTACTTAG 
ATGGTGGTTTATCTGATAGTATTCCCGTTGATTTTGCCCGTGGTTTAGGA 
TTTGACAAGTTGATTGTTGTGATGACTAGGCCGCTCAATTATCAGAAAAA 
GCCTTCAAGTGGACGATTGTATAAAACTCTGTATAGGAAATATCCTAATT 
T T G T AAAG AC AG C CT C G AAT CGGT AC C AAC AG T AT AAT AAT AG T C T T G AA 
AAG GT CAT GAG C C T T GAAAAAAC AG G C GAT C T AT T T G C AAT TAG AC CG AG 
TAAGAGCTTGGTTATTGGCCGCTTAGAGAAGAATCCGGATAAACTTGATA
GT ATTT AT CAGCTTGGT ATGAAAGAT GCT AAAAGTGTGAT GCCTGAGCT G 
AAT AGTT AT CT AAT GAAA 

SEQ ID NO. 7703 

STRAIN A909 

CCTATGTTGTCTGTTGGTTTAGTTTTAGAG 

GGTGGCGGAATGAGAGGTCTTTATACTGCTGGAGTTTTAGATGCTTTTCT 
AGATGCAGGAATAAAAGTAGATGGTATCATATCTGTCTCTGCTGGTGCAT
TGTTTGGTGTTAATTTTGTATCTAGACAACGAGAGAGGGCTTTGCGATAC 
AATAAAAAGTATTTATCCCACCCTAAATATATGAGTCTAAGGTCATGGCT 
TCGAACAGGGAATTTTGTTAATAAAGATTTCACCTATTATGAAGTTCCTA 
T G AAAT T GG AT GT AT T T GAC GAT GAAG CAT T T AAAAAAT C AAGT AT T GAT 
T T T T AC GC AGT T GCT AC AG AG AT GAC AT C T GGT AAAC CT G AGT AT T T T AA 
AATTGATAGTGTTTTTGAACAAATGGAAATTTTACGTGCTAGTTCAGCAT 
TACCAGTAGTCTCAAAGAT GGT TGTTTGGCAGGGGAAAAAGT ACT T AGAT 
GGTGGTTTATCTGATAGTATTCCCGTTGATTTTGCCCGTGGTTTAGGATT 
T GAC AAGT T GAT T GT T GT GAT G AC T AGG C C G C T C AAT TAT C AGAAAAAGC 

CTTCAAGTGGACGATTGTATAAAACTCTGTATAGGAAATATCCTAATTTT 
GT AAAGAC AG C CT C G AAC CGG T AC C AAC AGT AT AAT AAT AGC CT T GAAAA 
GGT CAT G AGCC T T GAAAAAAC AG GC G AT CT AT T T G C AAT T AGAC C AAG T A 
AGAGCTTGGTTATTGGCCGCTTAGAGAAGAATCCGGATAAACTTGATAGT 

ATTTATCAGCTTGGTATGAAAGATGCTAAAAGTGGGATGCCTGAGCTGAA 
T AGTTAT CT AAT GAAA 

SEQ ID NO. 7704 

STRAIN H36B 

CCTATGTTGTCTGTTGGTTTAGTTTTAG 

AG GGT GGC GG AAT GAGAG GT CT T TAT AC T G CT G G AGT T T TAG AT G CT T T T 
CTAGATGCAGGAATAAAAGTAGATGGTATCATATCTGTCTCTGCTGGTGC 
ATTGTTTGGTGTTAATTTTGTATCTAGACAACGAGAGAGGGCTTTGCGAT 
AC AAT AAAAAGT AT T TAT C C C AC C C T AAAT AT AT G AGT CT AAGG T C AT GG 

CTTCGAACAGGGAATTTTGTTAATAAAGATTTCACCTATTATGAAGTTCC 
TAT GAAAT T GG AT GT AT T T GAC GAT GAAG CAT T T AAAAAAT C AAG TAT T G 

ATTTTTACGCAGTTGCTACAGAGATGACATCTGGTAAACCTGAGTATTTT 
AAAAT T GAT AGT GTT T T T G AAC AAAT GG AAAT T T T AC GT G C T AG T T C AG C 
AT T AC C AGT AG T C T C AAAG AT GGT T GT T T G G C AGG GG AAAAAGT AC T T AG 

ATGGTGGTTTATCTGATAGTATTCCCGTTGATTTTGCCCGTGGTTTAGGA 
T T T GAC AAG T T GATT GT T GT GAT GAC TAG G C C G C T C AAT TAT C AG AAAAA 
G C C T T C AAGT GG ACG AT T GT AT AAAAC T CT G T AT AG GAAAT AT C C T AAT T 
T T GT AAAG AC AG C CT CG AAC C GG T AC C AAC AGT AT AAT AAT AG C CT T G AA 
AAG G T CAT GAG C CT T GAAAAAAC AG G C GAT C TAT T T G C AA T TAG AC C AAG 
TAAGAGCTTGGTTATTGGCCGCTTAGAGAAGAATCCGGATAAACTTGATA 

GTATTT AT CAGCTTGGT AT GAAAGATGCT AAAAGT GGGATGCCTGAGCTG 
AATAGTTATCT AAT GAAA 

SEQ ID NO. 7705 

STRAIN 18RS21 

CCTATGTTGTCTGTTGGTTTAGTTTTAGAGG 

GTGGCGGAATGAGAGGTCTTTATACTGCTGGAGTTTTAGATGCTTTTCTA 
GATGCAGGAAT AAAAAT AGAT GGT ATCGTATCTGTCTCTGCTGGTGCATT 
GTTTGGTGT T AAT T T T GT AT CT AG AC AAC GAGAGAGG G CT T T G C GAT AC A 
AT AAAAAGT AT T T AT C C C AC C C T AAAT AT AT G AGT C T AAGG T CAT G GT T T 
C G AAC AGGG AAT T T T GT T AAT AAAG AT T T C AC CT AT T AT GAAGT T C C T AT 
GAAAT T G GAT GT AT T T GACG AT G AAGC AT T T AAAAAAT C AAGT AT T GAT T 
T T T AC GT AGT T G C T AC AG AG AT GAC AT C T G GT AAAC C T G AAT AT T T T AAA 
AT T GAT AG T G T T T T T G AAC AAAT GG AAAT T T T AC GT G C T AGT T C AG CAT T 
AC C AGT AG T CT C AAAG AT GGT T GAT T G G C AG GGG AAAAAGT AC T TAG AT G 
GTGGTTTATCTGATAGTATTCCCGTTGATTTTGCCCGTGGTTTAGGATTT 
GAC AAGT T GAT TGTTGT GAT GACTAGGCCGCTC AAT TAT CAGAAAAAGCC 
T T C AAGT GG AC G AT T G TAT AAAAC T C T GT AT AG GAAAT AT C C T AAT T T T G 
T AAAG AC AG C CT CGAAT C G G T AC C AAC AGT AT AAT AAT AGT C T T G AAAAG 
G T CAT GAG C C T T GAAAAAAC AGG C GAT C TAT T T GC AAT TAG AC C G AG T AA 
GAG C T T G GT T AT T GG C C G C T TAG AG AAG AAT C C G GAT AAAC T T GAT AG T A 

TTTATCAGCTTGGTATGAAAGATGCTAAAAGTGTGATGCCTGAGCTGAAT 
AGT T AT CT AAT GAAA 

SEQ ID NO. 7706 

STRAIN M732 

CCTATGTTGTCTGTTGGTTTAGTTTTAGA 

GGGTGGCGGAATGAGAGGTCTTTATACTGCTGGAGTTTTAGATGCTTTTC 
TAG AT G C AG G AAT AAAAAT AG AT G GT AT C GT AT CTGTCTCTGCGGGTG C A 
TTGTTTGGTGTTAATTTTGTATCTAGACAACGAGAGAGGGCTTTGCGATA
C AAT AAAAAGT AT T TAT C C C AC C CT G AAT AT AT GAGT C T AAG AT CAT G GC 
TTCGAACAGGGAATTTTGTTAATAAAGATTTCACCTATTATGAAGTTCCT 
AT G AAAT T GG ATGT AT T T GAC GAT GAAG CAT T T AAAAAAT C AAGT AT T GA 
TTTTTACGTAGTTGCTACAGAGATGACATCTGGTAAACCTGAATATTTTA 
AAAT T GAT AGT GT T T T T GAAC AAAT G G AAAT T T T AC GT GC T AGT T C AG C A 
TTACCAGTAGTCTCAAAGATGGTTGATTGGCAGGGGAAAAAGTACTTAGA 
TGGTGGTTTATCTGATAGTATTCCCGTTGATTTTGCCCGTGGTTTAGGAT 
T T GAC AAGT T GAT T GT T GT G AT G AC T AGG C C G CT C AAT TAT C AGAAAAAG 
CCTTCAAGTGGACGATTGTATAAAACTCTGTATAGGAAATATCCTAATTT 
T GT AAAG AC AG C C T C GAAT C GGT AC C AAC AGT AT AAT AAT AG T CT T G AAA 
AGGT CAT GAG C C T T GAAAAAAC AGG CG AT C T AT T T G C AAT T AGAC C GAGT 
AAGAGCTTGGTTATTGGCCGCTTAGAGAAGAATCCGGATAAACTTGATAG 
TATTTATCAGCTTGGTATGAAATATGCTAAAAGTGTGATGCCTGAGCTGA 
ATAGTTATCTAATGAAA 

SEQ ID NO. 7707 

STRAIN COH1 

CCTATGTTGTCTGTTGGTTTAGTTTTA 

GAGGGTGGCGGAATGAGAGGTCTTTATACTGCTGGAGTTTTAGATGCTTT 
TCTAGATGCAGGAATAAAAATAGATGGTATCGTATCTGTCTCTGCGGGTG 
CATTGTTTGGTGTTAATTTTGTATCTAGACAACGAGAGAGGGCTTTGCGA 
T AC AAT AAAAAG T AT T T AT C C C AC CC T GAAT AT AT GAGT C T AAG AT CAT G 
GCTTCGAACAGGGAATTTTGTTAATAAAGATTTCACCTATTATGAAGTTC 
C T AT G AAAT T GGAT GT AT T T G ACGAT GAAG CAT T T AAAAAAT C AAGT AT T 
GAT T T T T AC G T AGT T G C T AC AG AG AT GAC AT CT G GT AAAC CT GAAT AT T T 
T AAAAT T GAT AGT GT T T T T GAAC AAAT GG AAAT T T T AC GT G C T AG T T C AG 
C AT T AC C AGT AGT C T C AAAG AT GG T T GAT T GG C AG G G G AAAAAGT ACT T A 
GATGGTGGTTTATCTGATAGTATTCCCGTTGATTTTGCCCGTGGTTTAGG 
ATTTGACAAGTTGATTGTTGTGATGACTAGGCCGCTCAATTATCAGAAAA 
AGCCTTCAAGTGGACGATTGTATAAAACTCTGTATAGGAAATATCCTAAT 
TTTGTAAAGACAGCCTCGAATCGGTACCAACAGTATAATAATAGTCTTGA 
AAAGGTCATGAGCCTTGAAAAAACAGGCGATCTATTTGCAATTAGACCGA 
GTAAGAGCTTGGTTATTGGCCGCTTAGAGAAGAATCCGGATAAACTTGAT 
AGTATTTATCAGCTTGGTATGAAATATGCTAAAAGTGTGATGCCTGAGCT 
GAAT AGT TAT CT AAT GAAA 

SEQ ID NO. 7708 

STRAIN M781 

CCTATGTTGTCTGTTGGTTTAGTTTTAG 

AGGGTGGCGGAATGAGAGGTCTTTATACTGCTGGAGTTTTAGATGCTTTT 
CTAGATGCAGGAATAAAAATAGATGGTATCGTATCTGTCTCTGCGGGTGC 
ATTGTTTGGTGTTAATTTTGTATCTAGACAACGAGAGAGGGCTTTGCGAT 
AC AAT AAAAAGT AT T TAT C CC ACC CT GAAT AT AT GAGT C T AAG AT CAT G G 
C T T C GAAC AGG GAAT T T T GT T AAT AAAG AT T T C AC CT AT T AT G AAGT T C C 
TAT G AAAT T G GAT GT AT T T GAC GAT GAAG CAT T T AAAAAAT C AAGT AT T G 
ATTTTTACGTAGTTGCTACAGAGATGACATCTGGTAAACCTGAATATTTT 
AAAATTGATAGTGTTTTTGAACAAATGGAAATTTTACGTGCTAGTTCAGC 
ATTACCAGTAGTCTCAAAGATGGTTGATTGGCAGGGGAAAAAGTACTTAG 
ATGGTGGTTTATCTGATAGTATTCCCGTTGATTTTGCCCGTGGTTTAGGA 
T T T GAC AAGT T GAT T GT T GT GAT G AC T AGG C C G C T C AAT T AT C AG AAAAA 
GCCTTCAAGTGGACGATTGTATAAAACTCTGTATAGGAAATATCCTAATT 
T T GT AAAG AC AG C CT C GAAT C G GT AC C AAC AGT AT AAT AAT AGT CT T G AA 
AAGG T CAT G AGC CTT GAAAAAAC AG G C GAT C TAT T T G C AAT TAG AC C GAG 
TAAGAGCTTGGTTATTGGCCGCTTAGAGAAGAATCCGGATAAACTTGATA 
GTATTTATCAGCTTGGTATGAAATATGCTAAAAGTGTGATGCCTGAGCTG 
AAT AGT T AT CT AAT GAAA 

SEQ ID NO. 7709 

STRAIN CJB110 

CCTATGTTGTCTGTTGGTTTAGTTTTA 

GAGGGTGGCGGAATGAGAGGTCTTTATACTGCTGGAGTTTTAGATGCTTT 
TCTAGATGCAGGAATAAAAATAGATGGTATCGTATCTGTCTCTGCTGGTG 
CATTGTTTGGTGTTAATTTTGTATCTAGACAACGAGAGAGGGCTTTGCGA 
TACAATAAAAAGTATTTATCCCACCCTAAATATATGAGTCTAAGGTCATG
GTTTCGAACAGGGAATTTTGTTAATAAAGATTTCACCTATTATGAAGTTC 
CTATGAAATTGGATGTATTTGACGATGAAGCATTTAAAAAATCAAGTATT 
GAT T T T T ACGT AGT T G CT AC AG AG AT GAC AT CT GGT AAAC CT G AAT AT T T 
TAAAATTGATAGTGTTTTTGAACAAATGGAAATTTTACGTGCTAGTTCAG 
CATTACCAGTAGTCTCAAAGATGGTTGATTGGCAGGGGAAAAAGTACTTA 
GATGGTGGTTTATCTGATAGTATTCCCGTTGATTTTGCCCGTGGTTTAGG 
AT T T G AC AAGT T GAT T GT TGT G AT GAC T AGG C C G CT C AAT TAT C AG AAAA 
AG C CT T C AAGT G G ACG AT TGT AT AAAACT CT GT AT AG G AAAT AT C CT AAT 
T T T GT AAAG AC AG C CT CG AAT GG GT AC CAAC AGT AT AAT AAT AGT CT T G A 
AAAGGT CAT GAG C CT T G AAAAAAC AG G C GAT CT AT T T GC AAT TAG AC C G A 
GTAAGAGCTTGGTTATTGGCCGCTTAGAGAAGAATCCGGATAAACTTGAT 
AG TAT T TAT C AG C T T GGT AT G AAAGAT G C T AAAAGT G T GAT G C C T G AGCT 
GAATAGT TAT CT AATGAAA 

SEQ ID NO. 7710 

STRAIN 1169NT 

CCTATGTTGTCTGTTGGTTTAGTTTTAGAGGGTG 

GC GGAAT G AGAGGT C T T TAT AC T G C T G G AGT T T T AGAT G CT T T T C T AGAT 
G C AG G AAT AAAAAT AG AT GGT AT C GT AT CTGTCTCTGCGGGTG CAT T GT T 
T GG T GT T AAT T T T GT AT C T AGAC AACG AG AG AGG G C T T T G CG AT AC AAT A 
AAAAGT AT T TAT C CC AC C CT AAAT AT AT G AGT C T AAG AT C AT GG CT T C G A 
ACAGGGAATTTTGTTAATAAAGATTTCACCTATTATGAAGTTCCTATGAA 
AT T G G AT GT AT T T G ACG AT G AAG CAT T T AAAAAAT C AAGT AT T GAT T T T T 
ACG C AGT T G CT AC AG AG AT GACAT C T G G T AAAC C T G AAT AT T T T AAAAT T 
GATAGTGTCTTTGAACAAATGGAAATTTTACGTGCTAGTTCAGCATTACC 
AGT AGT C T C AAAG AT GG T T GAT T GG C AGG G G AAAAAGT ACT T AG AT GG T G 
GTTTATCTGATAGTATCCCCGTTGATTTTGCCCGTGGTTTAGGATTTGAC 
AAGTTGATTGTTGTGATGACTAGGCCGCTCAATTATCAGAAAAAGCCTTC 
AAGTGGACGATTGTATAAAACTCTGTATAGGAAATATCCTAATTTTGTAA 
AGAC AGCCT CGAAT C GGT ACCAACAGT AT AAT AAT AG CCTT GAAAAGGT C 
ATGAGCCTTGAAAAAACAGGCGATCTATTTGCAATTAGGCCGAGTAAAAG 
CT T G G T TAT TGTCCGCT T AGAG AAG AAT C C GGAT AAAC T T GAT AGT AT TT 
AT C AG C T T G GT AT G AAAG AT G C T AAAAGT G T GAT G C C T G AGC T GAATAGT 
TAT CT AATGAAA 

SEQ ID NO. 7711 

STRAIN JM9130013 

CCTATGTTGTCTGTTGGTTTAGTTTTAGAG 

GGTGGCGGAATGAGAGGTCTTTATACTGCTGGAGTTTTAGATGCTTTTCT 
AGATGCAGGAATAAAAGTAGATGGTATCATATCTGTCTCTGCTGGTGCAT 
TGTTTGGTGTTAATTTTGTATCTAGACAACGAGAGAGGGCTTTGCGATAC 
AATAAAAAGTATTTATCCCACCCTAAATATATGAGTCTAAGGTCATGGCT 
TCGAACAGGGAATTTTGTTAATAAAGATTTCACCTATTATGAAGTTCCTA 
T G AAAT T GGAT GT AT T T GAC GAT G AAGC AT T T AAAAAAT C AAG TAT T GAT 
T T T T AC G C AG T T GCT AC AG AG AT GACAT CT GGT AAAC C T GAG TAT T T T AA 
AATTGATAGTGTTTTTGAACAAATGGAAATTTTACGTGCTAGTTCAGCAT 
T AC C AGT AGT C T C AAAG AT GGT T G T T T GG C AG GG G AAAAAGT ACT TAG AT 
GGTGGTTTATCTGATAGTATTCCCGTTGATTTTGCCCGTGGTTTAGGATT 
T GAC AAGT T GAT T G T T G T GAT GAC TAG G CC G CT C AAT TAT C AG AAAAAG C 
CTTCAAGTGGACGATTGTATAAAACTCTGTATAGGAAATATCCTAATTTT 
GT AAAG AC AG C C T CG AAC C GGT AC CAAC AGT AT AAT AAT AG CCT T G AAAA 
GGT CAT GAG CCTT G AAAAAAC AG G C GAT CT AT T T G C AAT TAG AC C AAGT A 
AGAG C T T G GT T AT TGGCCGCT TAG AG AAG AAT C CG GAT AAAC T T GAT AG T 
AT T TAT C AG CT T GGT AT G AAAG AT G CT AAAAG T G G GAT G C C T GAG C T G AA 
T AGT TAT CT AATGAAA 

SEQ ID NO. 7712 
STRAIN 2603 frame: 1 

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKIDGIVSVSAGALFGVNFVSRQRERALRY 
NKKYLSHPKYMSLRSWFRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYVVATEMTS 
GKPEYFKIDSVFEQMEILRASSALPVVSKMVDWQGKKYLDGGLSDSIPVDFARGLGFDKL 
IVVMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI
RPSKSLVIGRLEKNPDKLDSIYQLGMKDAKSVMPELNSYLMK

SEQ ID NO. 7713

STRAIN 090 frame: 1 

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKIDGIVSVSAGALFGVNFVSRQRERALRY 
NKKYLSHPKYMSLRSWFRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYVVATEMTS
GKPEYFKIDSVFEQMEILRASSALPVVSKMVDWQGKKYLDGGLSDSIPVDFARGLGFDKL
IVVMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI
RPSKSLVIGRLEKNPDKLDSIYQLGMKDAKSVMPELNSYLMK 

SEQ ID NO. 7714 

STRAIN A909 frame: 1 

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKVDGIISVSAGALFGVNFVSRQRERALRY 
NKKYLSHPKYMSLRSWLRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYAVATEMTS 
GKPEYFKIDSVFEQMEILRASSALPVVSKMVVWQGKKYLDGGLSDSIPVDFARGLGFDKL
IVVMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI 
RPSKSLVIGRLEKNPDKLDSIYQLGMKDAKSGMPELNSYLMK 

SEQ ID NO. 7715 

STRAIN H36B frame: 1

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKVDGIISVSAGALFGVNFVSRQRERALRY 
NKKYLSHPKYMSLRSWLRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYAVATEMTS 
GKPEYFKIDSVFEQMEILRASSALPVVSKMVVWQGKKYLDGGLSDSIPVDFARGLGFDKL
IVVMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI
RPSKSLVIGRLEKNPDKLDSIYQLGMKDAKSGMPELNSYLMK 

SEQ ID NO. 7716 

STRAIN 18RS21 frame: 1 

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKIDGIVSVSAGALFGVNFVSRQRERALRY 
NKKYLSHPKYMSLRSWFRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYVVATEMTS
GKPEYFKIDSVFEQMEILRASSALPVVSKMVDWQGKKYLDGGLSDSIPVDFARGLGFDKL
IVVMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI
RPSKSLVIGRLEKNPDKLDSIYQLGMKDAKSVMPELNSYLMK 

SEQ ID NO. 7717 

STRAIN M732 frame: 1 

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKIDGIVSVSAGALFGVNFVSRQRERALRY 
NKKYLSHPEYMSLRSWLRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYVVATEMTS 
GKPEYFKIDSVFEQMEILRASSALPVVSKMVDWQGKKYLDGGLSDSIPVDFARGLGFDKL
IVVMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI 
RPSKSLVIGRLEKNPDKLDSIYQLGMKYAKSVMPELNSYLMK 

SEQ ID NO. 7718 

STRAIN COH1 frame: 1 

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKIDGIVSVSAGALFGVNFVSRQRERALRY 
NKKYLSHPEYMSLRSWLRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYVVATEMTS
GKPEYFKIDSVFEQMEILRASSALPVVSKMVDWQGKKYLDGGLSDSIPVDFARGLGFDKL
IVVMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI
RPSKSLVIGRLEKNPDKLDSIYQLGMKYAKSVMPELNSYLMK 

SEQ ID NO. 7719 

STRAIN M781 frame: 1 

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKIDGIVSVSAGALFGVNFVSRQRERALRY
NKKYLSHPEYMSLRSWLRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYVVATEMTS
GKPEYFKIDSVFEQMEILRASSALPVVSKMVDWQGKKYLDGGLSDSIPVDFARGLGFDKL
IVVMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI
RPSKSLVIGRLEKNPDKLDSIYQLGMKYAKSVMPELNSYLMK 

SEQ ID NO. 7720 

STRAIN CJB110 frame: 1 

PML SVGLVLEGGGMRGLYT AG VLDAFL DAG IKIDGIVSVS AG ALFGVNFVSRQRERALRY 
NKKYLSHPKYMSLRSWFRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYVVATEMTS 
GKPEYFKIDSVFEQMEILRASSALPVVSKMVDWQGKKYLDGGLSDSIPVDFARGLGFDKL
IVVMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI
RPSKSLVIGRLEKNPDKLDSIYQLGMKDAKSVMPELNSYLMK 
SEQ ID NO. 7721

STRAIN JM9130013 frame: 1 

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKVDGIISVSAGALFGVNFVSRQRERALRY 
NKKYLSHPKYMSLRSWLRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYAVATEMTS 
GKPEYFKIDSVFEQMEILRASSALPWSKMWWQGKKYLDGGLSDSIPVDFARGLGFDKL 
IVVMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI
RPSKSLVIGRLEKNPDKLDSIYQLGMKDAKSGMPELNSYLMK 

SEQ ID NO. 7722 

STRAIN 1169NT frame: 1 

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKIDGIVSVSAGALFGVNFVSRQRERALRY 
NKKYLSHPKYMSLRSWLRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYAVATEMTS 
GKPEYFKIDSVFEQMEILRASSALPVVSKMVDWQGKKYLDGGLSDSIPVDFARGLGFDKL
IVVMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI
RPSKSLVIVRLEKNPDKLDSIYQLGMKDAKSVMPELNSYLMK 

SEQ ID NO. 7801 
STRAIN 2603 

AT G AAAGT T T T AGT AG T T GAT GAT GAAC C AG T T G C AC GT AAC G AAT T AAT T TAG C TT C T T 
AATAAGTATGATTCTAACCTCGTTATAGCAGAGGCGCATGATATGGCTACTGCATTAGCT 
AT T T T AC T T AGAG AAACT T T T GAT GT AG C AC T GT T AGAT AT C CAT CT C AG AGAT GAT T C T 
GGGTTGCAATTAGCAGAGTATATCAATAAAATGCCCAAACCACCATTATTGATATTTGCG 
ACTGCTTATGATCAATATGCTATTCAGGCTTTTGAGCATGATGCGCGTGATTATTTGTTA 
AAAC C C T AT GAT T T T G AT AG GC T AAAG C AAG CT AT GG AT AG AG T AAAAG GAG C GC T AAGT 
AC AT CT AC AAT TAT AGAG AG C G T AACT TCCGGTCCTCTCTT C AAG C AAC AGT AT C CAT T G 
ACAGTAGAAGATCGAATCTATCTGGTGTCGGCGGATGATATCCTTTTGATTGAAGCTATG 
C AAG G AAAAC T GAT TAT AC AAAC AC CT G AT AAAAAT TAT G AAAT T GAT GG C T C T C T AC AA 
CAATGGCAAGATAAACTACCATCATCTCAATTTGTACGGGTACATCGCTCTTACATTGTG 
AAC AT T AAT G C T AT T AAAAC GAT T GAAC C T T G GTT T AAC C AAAC AC T T C AGT T AC AC C T T 
T GT AAT AAAAT AAC AGT T C CT GT T AGC AG AG C AAAT GT AAAAC C C C T AAAAC AAAT GT T A 
GGCATATCTACC 

SEQ ID NO. 7802 

STRAIN 090 

AAAG T T T T AGT AGT T GAT GAT G AAC C AGT T G C AC GT AA 
CGAATTAATTTACCTTCTTAATAAGTATGATTCTAACCTCGTTATAGCAG 
AGG C GC AT G AT AT GG C T AC T GC AT T AG C TAT T T T AC T T AGAG AAAC T T T T 
G AT GT AG C ACT GT TAG AT AT C CAT C T C AGAG AT GAT TCTGGGTTG C AAT T 
AG C AG AGT AT AT C AAT AAAAT G CC C AAAC C AC C AT TAT T GAT AT T T GC G A 
C T G CT TAT GAT C AAT AT G C TAT T C AGG C T T T T GAG CAT GAT G C G CGT G AT 
TAT T T GT T AAAAC CC TAT GAT T T T GAT AG G C T AAAGC AAG C T AT GG AT AG 
AGT AAAAG G AGC G CT AAGT AC AT C T AC AAT TAT AGAG AG C GT AACT T C C G 
GTCCTCTCTT C AAG C AAC AGT AT C CAT T G AC AGT AG AAG AT C G AAT CT AT 
CTGGTGTCGGCGGATGATATCCTTTTGATTGAAGCTATGCAAGGAAAACT 
GAT TAT AC AAAC AC C T GAT AAAAAT TAT GAAAT T GAT GG C T C T C T AC AAC 
AATGGCAAGATAAACTACCATCATCTCAATTTGTACGGGTACATCGCTCT 
TACATTGTGAACATTAATGCTATTAAAACGATTGAACCTTGGTTTAACCA 
AAC AC T T C AGT T AC AC C T T T GT AAT AAAAT AAC AG T T C C T G T T AG C AG AG 
CAAATGTAAAACCCCTAAAACAAATGTTAGGCATATCTACC 

SEQ ID NO. 7803 

STRAIN A909 

AAAGTTTTAGTAGTTGATGATGAACCAGTTGCACGTAAC 
GAATTAATTTACCTTCTTAATAAGTATGATTCTAACCTCGTTATAGCAGA 
GGCGCATGATATGGCTACTGCATTAGCTATTTTACTTAGAGAAACTTTTG 
AT GT AG C ACT GT TAG AT AT C CAT C T C AG AG AT GAT TCTGGGTTG C AAT T A 
G C AG AG TAT AT C AAT AAAAT GC C C AAAC C AC CAT TAT T GAT AT T C G C G AC 
TGCTTATGATCAATATGCTATTCAAGCTTTTGAGCATGATGCGCGTGATT 
AT T T GT T AAAAC C CT AT GAG T T T GAT AGG C T AAAG C AAG C T AT G GAT AG A 
GT AAAAG GAG C G C T AAG T AC AT CT AC AAT T AT AG AG AG C GT AACT T C C G G 
CCCTCTCTT C AAG C AAC AG TAT C C AT T G AC AGT AGAAG AT C G AAT C T AT C 
TGGTGTCGGCGGATGATATCCTTTTGATTGAAGCTATGCAAGGAAAACTG 
ATTATACAAACACCTGATAAAAATTATGAAATTGATGGCTCTCTACAACA 
ATGGCAAGATAAACTACCATCATCTCAATTTGTACGGGTGCACCGCTCTT
AC AT T G T G AAT AT T AAT G CT AT T AAAAC GAT T G AAC CT T GG T T T AAC C AA 
AC AC T T C AGT T AC AC C T T T GT AAT AAAAT AAC AGT T C C T GT TAG C AG AGC 
AAATGTAAAACCCCTAAAACAAATGTTAGGCATATCTACC 

SEQ ID NO. 7804 

STRAIN H36B 

AAAGT T T T AGT AGT T GAT GAT GAAC C AGT T GC ACGT 

AACGAATTAATTTACCTTCTTAATAAGTATGATTCTAACCTCGTTATAGC 
AGAG G C G CAT GAT AT G GCT AC T G CAT TAG C T AT T T TAG T TAG AG AAAC T T 
T T GAT GT AG C ACT GT TAG AT AT C CAT CT C AG AG AT GAT TCTGGGTTG C AA 
T T AGC AGAGT AT AT C AAT AAAAT GC C C AAAC C AC CAT TAT T GAT AT T C G C 
GACTGCTTATGATCAATATGCTATTCAAGCTTTTGAGCATGATGCGCGTG 
AT TAT T T GT T AAAAC C CT ATG AGT T T GAT AGGCT AAAG C AAG CT AT G GAT 
AGAGTAAAAGGAGCGCTAAGTACATCTACAATTATAGAGAGCGTAACTTC 
CGGCCCTCTCTTCAAGCAACAGTATCCATTGACAGTAGAAGATCGAATCT 
ATCTGGTGTCGGCGGATGATATCCTTTTGATTGAAGCTATGCAAGGAAAA 
C T G AT T AT AC AAAC AC CT G AT AAAAAT T AT G AAAT T GAT G G C T CT CT AC A 
AC AAT G GC AAG AT AAAC TAG CAT CAT C T C AAT T T GT ACGG G T G C AC C G CT 
C T T AC AT T GT G AAT AT T AAT G C T AT T AAAACGAT T G AAC CT T G GT T T AAC 
CAAAC ACT TCAGTTACACCTTTGTAAT AAAAT AACAGTTCCTGTTAGCAG 
AGCAAATGTAAAACCCCTAAAACAAATGTTAGGCATATCTACC 

SEQ ID NO. 7805 

STRAIN 18RS21 

AAAGTTT T AGT AGTT G AT GAT GAAC C AGT T G C ACGT AAC 

GAATTAATTTACCTTCTTAATAAGTATGATTCTAACCTCGTTATAGCAGA 

GGCGCATGATATGGCTACTGCATTAGCTATTTTACTTAGAGAAACTTTTG 

ATGTAGCACTGTTAGATATCCATCTCAGAGATGATTCTGGGTTGCAATTA 

G C AG AGT AT AT C AAT AAAAT G C C CAAAC C AC CAT TAT T GAT AT T T GC G AC 

TGCTTATGATCAATATGCTATTCAGGCTTTTGAGCATGATGCGCGTGATT 

AT T T GT T AAAAC C CT AT GAT T T T GAT AG G CT AAAGC AAG CT AT G GAT AG A 

GTAAAAGGAGCGCTAAGTACATCTACAATTATAGAGAGCGTAACTTCCGG 

TCCTCTCTT C AAG C AAC AGT AT C C AT T G AC AGT AG AAG AT C GAAT C T AT C 

TGGTGTCGGCGGATGATATCCTTTTGATTGAAGCTATGCAAGGAAAACTG 

ATT AT AC AAAC AC CT GAT AAAAAT TAT GAAAT T GAT GGCT CT CT AC AAC A 

ATGGCAAGATAAACTACCATCATCTCAATTTGTACGGGTACATCGCTCTT 

AC AT T GT GAAC AT T AAT G CT AT T AAAAC GAT T GAAC CT T GGT T T AAC C AA 

AC ACT T C AGT T AC AC C T T T GT AAT AAAAT AAC AGT T CCT GT T AG C AG AGC 

AAATGTAAAACCCCTAAAACAAATGTTAGGCATATCTACC 

SEQ ID NO. 7806 

STRAIN M732 

AAAGT T TT AGT AGTT GAT G ATG AAC C AGTT 

GCACGTAACGAATTAATTTACCTTCTTAATAAGTATGATTCTAACCTCGT 
TAT AG C AG AG G CG C AT GAT AT G G CT AC T G CAT TAG C T AT T T TACT TAG AG 
AAACTT T T GAT GT AG C AC T GT T AG AT AT C CAT CT C AG AG AT GAT T CT G GG 
T T G C AAT T AGC AG AGT AT AT C AAT AAAAT G C C C AAAC C AC CAT TAT T GAT 
AT T C G C G AC T G CT TAT GAT C AAT AT G C T AT T C AGG CT T T T G AG C AGG AT G 
CGCGTGATTATTTGTTAAAACCCTATGAGTTTGATAGGTTAAAGCAAGCT 
AT G G AT AG AG T AAAAG GAG C G C T AAG T AC AT C T AC AAT TAT AG AG AG C GT 
AGC T T C C GGT CCTCTCTT C AAGC AAC AG T AT CC AT T G AC AG TAG AAG AT C 
GAATCTATCTGGTGTCGGCGGATGATATCCTTTTGATTGAAGCTATGCAA 
G G AAAAC T GAT TAT AC AAAC AC C T GAT AAAAAT TAT GAAAT T G AT GGC T C 
TCTACAACAATGGCAAGATAAACTACCATCATCTCAATTTGTACGGGTAC
ATCGCTCTTACATTGTGAATATTAATGCTATTAAAACGATTGAACCTTGG 
TTTAACCAAACACTTCAGTTACACCTTTGTAATAAAATAACAGTTCCTGT 
TAGCAGAGCAAATGTAAAACCCCTAAAACAAATGTTAGGCATATCTACC 

SEQ ID NO. 7807 

STRAIN COH1 

AAAG T T T T AG TAG T T GAT GAT GAAC C AG T T G C AC G T A 

ACGAATTAATTTACCTTCTTAATAAGTATGATTCTAACCTCGTTATAGCA 

GAGGCGCATGATATGGCTACTGCATTAGCTATTTTACTTAGAGAAACTTT 
TGATGTAGCACTGTTAGATATCCATCTCAGAGATGATTCTGGGTTGCAAT
TAGCAGAGTATATCAATAAAATGCCCAAACCACCATTATTGATATTCGCG 
ACTGCTTATGATCAATATGCTATTCAGGCTTTTGAGCAGGATGCGCGTGA 
TTATTTGTTAAAACCCTATGAGTTTGATAGGTTAAAGCAAGCTATGGATA 
GAGTAAAAGGAGCGCTAAGTACATCTACAATTATAGAGAGCGTAGCTTCC 
GGTCCTCTCTT C AAGC AAC AGT AT C CAT T G AC AGT AGAAGAT C G AAT C T A 
TCTGGTGTCGGCGGATGATATCCTTTTGATTGAAGCTATGCAAGGAAAAC 
T GAT T AT AC AAAC AC C T G AT AAAAAT T AT G AAAT T GAT G GCT C T C T AC AA 
C AAT GG C AAG AT AAAC T AC CAT CAT CT C AAT T T GT AC GG GT AC AT CG C T C 
T T AC AT T G T G AAT AT T AAT GC T AT T AAAACG AT T G AAC C T T GG T T T AAC C 
AAACACTTCAGTTACACCTTTGTAATAAAATAACAGTTCCTGTTAGCAGA 
GCAAATGTAAAACCCCTAAAACAAATGTTAGGCATATCTACC 

SEQ ID NO. 7808 

STRAIN M781 

AAAGTTTTAGTAGTTGATGATGAACCAGTTGCACGTAAC 

G AAT T AAT T T AC C T T CT T AAT AAGT AT GAT T CT AAC CT C GT T AT AGC AG A 

GGCGCATGATATGGCTACTGCATTAGCTATTTTACTTAGAGAAACTTTTG 

ATGTAGCACTGTTAGATATCCATCTCAGAGATGATTCTGGGTTGCAATTA 

G C AG AGT AT AT C AAT AAAAT G C C C AAAC C AC CAT TAT T GAT AT T C GC G AC 

T GC T T AT GAT C AAT AT G CT AT T C AG G CT T T T GAG C AGG AT GCG C GT GAT T 

AT T T GT T AAAAC C CT AT GAGT T T GAT AG G T T AAAG C AAG CT AT GGAT AG A 

GTAAAAGGAGCGCTAAGTACATCTACAATTATAGAGAGCGTAGCTTCCGG 

TCCTCTCTT C AAGC AAC AGT AT C CAT T G AC AGT AG AAG AT C G AAT CT AT C 

TGGTGTCGGCGGATGATATCCTTTTGATTGAAGCTATGCAAGGAAAACTG 

ATTATACAAACACCTGATAAAAATTATGAAATTGATGGCTCTCTACAACA 

ATGGCAAGATAAACTACCATCATCTCAATTTGTACGGGTACATCGCTCTT 

ACATTGTGAATATTAATGCTATTAAAACGATTGAACCTTGGTTTAACCAA 

ACACTTCAGTTACACCTTTGTAATAAAATAACAGTTCCTGTTAGCAGAGC 

AAAT G T AAAAC CC C T AAAAC AAAT G T T AG G CAT AT CT AC C 

SEQ ID NO. 7809 

STRAIN CJB110 

CTTAATAAGTATGATTCTAACCTCGTTATAGCAGAGGCGCATGATATGGC 
TACT GC AT T AG CT AT T T T AC T TAG AG AAAC T T T T GAT GT AG C AC T GT TAG 
AT AT C CAT C T C AGAG AT GAT TCTGGGTTG C AAT TAG C AG AGT AT AT C AAT 
AAAATGCCCAAACCACCATTATTGATATTCGCGACTGCTTATGATCAATA 
TGCTATTCAAGCTTTTGAGCATGATGCGCGTGATTATTTGTTAAAACCCT 
ATGAGTTTGATAGGCTAAAGCAAGnTATGGATAGAGTAAAAGGAGCGCTA 
AGTACATCTACAATTATAGAGAGCGTAACTTCCGGCCCTCTCTTCAAGCA 
AC AGT AT C CAT T G AC AGT AG AAG AT nG AAT CT AT CT G GT GT C G G C GG AT G 
AT AT C CT T T T GAT T G AAG C T AT G C AAG G AAAAC T GAT TAT AC AAAC AC CT 
GAT AAAAAT TAT G AAAT T GAT G G C T CT C T AC AAC AAT GG C AAGAT AAAC T 
ACCATCATCTCAATTTGTACGGGTGCACCGCTCTTACATTGTGAATATTA 
AT G C TAT T AAAAC GAT T G AAC C T T GGT T T AAC C AAAC AC T T C AGT T AC AC 
CT T T GT AAT AAAAT AAC AG T T C C T GT TAG C AG AG C AAAT GT AAAAC C C C T 
AAAAC AAAT GTT AGG 

SEQ ID NO. 7810 

STRAIN 1169NT 

AAAGT TT T AGT AGTT G AT GAT GAACCAG 

TTGCACGTAACGAATTAATTTATCTTCTT AAT AAGT ATGATTCTAACCTC 
GTTATAGCAGAGGCGCATGATATAGCTACTGCATTAGCTATTTTACTTAG 
AG AAAC T T T T GAT G T AG C AC T G T TAG AT AT C CAT CT C AG AG AT GAT T C T G 
G G T T G C AAT TAG C AG AG TAT AT C AAT AAAAT G CC C AAAC C AC CAT TAT T G 
ATATTCGCGACTGCTTATGATCAATATGCTATTCAGGCTTTTGAGCATGA
TGCGCGTGATTATTTGTTAAAACCCTATGAGTTTGATAGGCTAAAGCAAG 
C T AT G GAT AG AG T AAAAG GAG C G C T AAG T AC AT C T AC AAT TAT AGAG AGC 
G T AAC TTCCGGCC CT C T C T T C AAG C AAC AG TAT C CAT T G AC AGT AG AAG A 
TCGAATCTATCTGGTGTCGGCGGATGATATCCTTTTGATTGAAGCTATGC 
AAG GAAAAC T GAT T AT AC AAAC AC C T GAT AAAAAT TAT G AAAT T GAT G GC 
T C T CT AC AAC AAT GG C AAG AT AAACT AC C AT C AT C T C AAT TTGTACGGGT 
G C AC C GCT C T T AC AT T G T G AAT AT T AAT G C TAT T AAAAC GAT T G AAC C T T 
GGTTTAACCAAACACTTCAGTTACACCTTTGTAATAAAATAACAGTTCCT 
GTTAGCAGAGCAAATGTAAAACCCCTAAAACAAATGTTAGGCATATCTAC
C

SEQ ID NO. 7811 

STRAIN JM9130013 

AAAGT T T T AGT AGT T GAT GAT G AAC C AGT 

TGCACGTAACGAATTAATTTACCTTCTTAATAAGTATGATTCTAACCTCG 
T TAT AG C AG AGG C G CAT GAT AT GG C T AC T G CAT TAG C T ATT T TACT TAG A 
G AAAC T T T T GAT GT AG C AC T GT T AGAT AT C C AT CT C AGAG AT GAT T C T GG 
GTTGCAATT AGCAGAGT AT AT CAATAAAAT GCCCAAACC AC CATT ATT GA 
TATTCGCGACTGCTTATGATCAATATGCTATTCAAGCTTTTGAGCATGAT 
GCGCGTGATTATTTGTTAAAACCCTATGAGTTTGATAGGCTAAAGCAAGC 
T ATGGAT AGAGTAAAAGGAG CGCTAAGT ACAT CT ACAAT T AT AGAGAGCG 
T AACT TCCGGCCCTCT CT T C AAG C AAC AGT AT C C AT T G AC AGT AG AAGAT 
CGAATCTATCTGGTGTCGGCGGATGATATCCTTTTGATTGAAGCTATGCA 
AGGAAAACTGATTATACAAACACCTGATAAAAATTATGAAATTGATGGCT 
CT CT AC AAC AAT GG C AAG AT AAAC T AC CAT CAT C T C AAT TTGTACGGGTG 
CACCGCTCTTACATTGTGAATATTAATGCTATTAAAACGATTGAACCTTG 
GT T T AAC C AAAC AC T T C AG T T AC AC C T T T GT AAT AAAAT AAC AG T T C C T G 
TTAGCAGAGCAAATGTAAAACCCCTAAAACAAATGTTAGGCATATCTACC 

SEQ ID NO. 7812 
STRAIN 2603 frame: 1

KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSG
LQLAEYINKMPKPPLLIFATAYDQYAIQAFEHDARDYLLKPYDFDRLKQAMDRVKGALST
STIIESVTSGPLFKQQYPLTVEDRIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQ
WQDKLPSSQFVRVHRSYIVNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQMLG
IST

SEQ ID NO. 7813 

STRAIN 090 frame: 1

KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSG
LQLAEYINKMPKPPLLIFATAYDQYAIQAFEHDARDYLLKPYDFDRLKQAMDRVKGALST 
STIIESVTSGPLFKQQYPLTVEDRIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQ 
WQDKLPSSQFVRVHRSYIVNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQMLG 
IST

SEQ ID NO. 7814 

STRAIN A909 frame: 1

KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSG 
LQLAEYINKMPKPPLLIFATAYDQYAIQAFEHDARDYLLKPYEFDRLKQAMDRVKGALST 
STIIESVTSGPLFKQQYPLTVEDRIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQ 
WQDKLPSSQFVRVHRSYIVNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQMLG 
IST

SEQ ID NO. 7815 

STRAIN H36B frame: 1

KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSG
LQLAEYINKMPKPPLLIFATAYDQYAIQAFEHDARDYLLKPYEFDRLKQAMDRVKGALST 
STIIESVTSGPLFKQQYPLTVEDRIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQ 
WQDKLPSSQFVRVHRSYIVNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQMLG 
IST

SEQ ID NO. 7816 

STRAIN 18RS21 frame: 1 

KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSG 
LQLAEYINKMPKPPLLIFATAYDQYAIQAFEHDARDYLLKPYDFDRLKQAMDRVKGALST 
STIIESVTSGPLFKQQYPLTVEDRIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQ 
WQDKLPSSQFVRVHRSYIVNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQMLG
IST

SEQ ID NO. 7817 

STRAIN M732 frame: 1

KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSG
LQLAEYINKMPKPPLLIFATAYDQYAIQAFEQDARDYLLKPYEFDRLKQAMDRVKGALST
STIIESVASGPLFKQQYPLTVEDRIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQ 
WQDKLPSSQFVRVHRSYIVNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQMLG 
IST

SEQ ID NO. 7818 

STRAIN COH1 frame: 1 

KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSG
LQLAEYINKMPKPPLLIFATAYDQYAIQAFEQDARDYLLKPYEFDRLKQAMDRVKGALST
STIIESVASGPLFKQQYPLTVEDRIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQ
WQDKLPSSQFVRVHRSYIVNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQMLG
IST

SEQ ID NO. 7819 

STRAIN M781 frame: 1 

KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSG
LQLAEYINKMPKPPLLIFATAYDQYAIQAFEQDARDYLLKPYEFDRLKQAMDRVKGALST 
STIIESVASGPLFKQQYPLTVEDRIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQ 
WQDKLPSSQFVRVHRSYIVNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQMLG 
IST

SEQ ID NO. 7820 

STRAIN CJB110 frame: 1 

LNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSGLQLAEYINKMPKPPLLIF 
ATAYDQYAIQAFEHDARDYLLKPYEFDRLKQXMDRVKGALSTSTIIESVTSGPLFKQQYP 
LTVEDXIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQWQDKLPSSQFVRVHRSYI 
VNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQML 

SEQ ID NO. 7821 

STRAIN 1169NT frame: 1 

KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDIATALAILLRETFDVALLDIHLRDDSG 
LQLAEYINKMPKPPLLIFATAYDQYAIQAFEHDARDYLLKPYEFDRLKQAMDRVKGALST 
STIIESVTSGPLFKQQYPLTVEDRIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQ 
WQDKLPSSQFVRVHRSYIVNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQMLG 
IST

SEQ ID NO. 7822 

STRAIN JM9130013 frame: 1 

KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSG
LQLAEYINKMPKPPLLIFATAYDQYAIQAFEHDARDYLLKPYEFDRLKQAMDRVKGALST
STIIESVTSGPLFKQQYPLTVEDRIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQ
WQDKLPSSQFVRVHRSYIVNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQMLG
IST

SEQ ID NO. 7901
STRAIN 2603 

ATGGGAATTGAATTTAAAAATGTAAGTTATACCTATCAAGCCGGCACTCCTTTTGAAGGG 
CGTGCCCTTTTTGACGTCAATCTGAAAATTGAAGATGCTTCCTATACCGCGTTCATTGGG 
C AC AC AG GT T CT G G AAAAT C AAC T AT TAT G C AAC T T T T G AAT G G T T T AC AT AT T C C T AC A 
AAAGGTGAGGTAATTGTCGATGATTTTTCTATTAAAGCAGGGGACAAGAACAAAGAAATC 
AAATTTATAAGGCAAAAAGTTGGTTTAGTTTTTCAATTTCCAGAAAGTCAGCTTTTTGAA 
GAGACAGTTTTAAAGGATGTTGCTTTTGGACCACAAAATTTTGGTATTTCTCAGATTGAA 
G CT G AAAG G C T GG CT G AAG AAAAAT T AAGG T T AGT T GGT AT C AG T GAG GAT T T AT T CG AT 
AAAAATCCATTTGAACTTTCTGGAGGGCAGATGAGGCGGGTTGCTATAGCTGGTATTTTA 
G C GAT GG AAC C C AAAG T AC T AGT AC T GG AT GAG C CAAC AG CT G G AC T T GAT C CT AAG G G A 
AGAAAAGAATTAATGACTCTTTTTAAAAATCTTCATAAAAAAGGAATGACTATCGTCTTA 
GT G AC T C ACT T AAT GG ACG AT GT AG C GG AT TAT G C T G AC TAT G T GT AT G T T T TAG AAG C A 
GGGAAAGTAACCTTATCAGGACAACCAAAACAGATTTTTCAAGAAGTAGAACTTTTAGAA 
AGT AAAC AAT TAG G AGT T C C C AAAAT C AC C AAG T TT G CT C AAAG ACT AT CT CAT AAG GG A 
TTAAATTTACCTAGTTTACCAATTACTATTAACGAATTTGTGGAGGCTATTAAGCATGGA 

SEQ ID NO. 7902 
STRAIN 090 

G G AAT T G AAT T T AAAAAT GTAAGT TAT AC C T AT C AAG C C 
GGCACTCCTTTTGAAGGGCGTGCCCTTTTTGACGTCAATCTGAAAATTGA
AGATGCTTCCTATACCGCGTTCATTGGGCACACAGGTTCTGGAAAATCAA 
CTATTATGCAACTTTTGAATGGTTTACATATTCCTACAAAAGGTGAGGTA 
AT T GT CG AT GAT T T T T CT AT T AAAG CAGGGGAC AAG AAC AAAG AAAT C AA 
ATTTATAAGGCAAAAAGTTGGTTTAGTTTTTCAATTTCCAGAAAGTCAGC 
TTTTTGAAGAGACAGTTTTAAAGGATGTTGCTTTTGGACCACAAAATTTT 
GGT AT T T C T C AG AT T G AAGC T GAAAGG CT GG CT G AAGAAAAAT T AAG G T T 
AGTTGGTATCAGTGAGGATTTATTCGATAAAAATCCATTTGAACTTTCTG 
GAGGGCAGATGAGGCGGGTTGCTATAGCTGGTATTTTAGCGATGGAACCC 
AAAGTACTAGTACTGGATGAGCCAACAGCTGGACTTGATCCTAAGGGAAG 
AAAAGAAT T AAT G ACT CT T T T T AAAAAT CT T C AT AAAAAAGG AAT G ACT A 
TCGTCTTAGTGACTCACTTAATGGACGATGTAGCGGATTATGCTGACTAT 
GT GT AT GT T T TAG AAG C AGGG AAAGT AAC CT TAT C AGG AC AAC C AAAAC A 
G ATT T T T C AAG AAGT AGAACT T T T AG AAAGT AAAC AAT T AGG AGT T C C C A 
AAAT C AC C AAGT T T G C T C AAAG AC TAT CT CAT AAG G G AT T AAAT T T AC C T 
AGT T T AC C AAT T AC TAT T AAC G AAT TT GT GGAG G C TAT T AAG CAT G G A 

SEQ ID NO. 7903 

STRAIN A909

GGAATTGAATTT AAAAAT GTAAGTT AT ACCTATCAA 

GCCGGCACTCCTTTTGAAGGGCGTGCCCTTTTTGACGTCAATCTGAAAAT 
T G AAGAT GCT T C CT AT AC C G C GT T CAT T G GG C AC AC AG GT T CT G GAAAAT 
C AACT AT T AT GC AAC T T T T G AAT GGT T T AC AT AT T C CT AC AAAAGG T GAG 
GTAATTGTCGATGATTTTTCTATTAAAGCAGGGGACAAGAACAAAGAAAT 
CAAATTTATAAGGCAAAAAGTTGGTTTAGTTTTTCAATTTCCAGAAAGTC 
AG CT T T T T GAAG AG AC AG T T T T AAAAGAT GT TG CT T T T G GAC C AC AAAAT 
T T T GGT AT T T C T C AG AT T GAAG CT GAAAGG CT GG C T G AAG AAAAAT T AAG 
G T T AGT T G GT AT C AG T GAGGAT T TAT T C GAT AAAAAT C CAT T T GAAC T T T 
CTGGAGGGCAGATGAGGCGGGTTGCTATAGCTGGTATTTTAGCGATGGAA 
CC C AAAGT AC T AGT AC T AGAT G AGC C AAC AGCT GGAC T T GAT C C T AAGGG 
AAGAAAAGAAT T AAT G ACT CT TT TT AAAAAT CTT C ATAAAAAAGGAATGA 
CTATCGTCTTAGTGACTCACTTAATGGACGATGTAGCGGATTATGCTGAC 
TAT GT GT AT G T T T TAG AAGC AGG G AAAGT AAC C T TAT C AG GAC AAC C AAA 
G C AG AT T T T T C AAG AAGT AGAAC T T T TAG AAAGT AAAC AAT T AGG AG T T C 
CCAAAATCACCAAGTTTGCTCAAAGGCTATCTCATAAGGGATTAAATTTA 
CCTAGTTTACCAATTACTATTAACGAATTTGTGGAGGCTATTAAGCATGG 
A 

SEQ ID NO. 7904 

STRAIN H36B

GGAATTGAATTT AAAAAT GTAAGTT AT AC 

CTATCAAGCCGGCACTCCTTTTGAAGGGCGTGCCCTTTTTGACGTCAATC 
TGAAAATTGAAGATGCTTCCTATACCGCGTTCATTGGGCACACAGGTTCT 
G GAAAAT C AACT AT TAT G C AACT T T T G AAT GGT T T AC AT ATT C CT AC AAA 
AGGTGAGGTAATTGTCGATGATTTTTCTATTAAAGCAGGGGACAAGAACA 
AAG AAAT C AAAT T T AT AAGG C AAAAAG T T GGT T TAG T T T T T C AAT T T C C A 
G AAAGT C AG CT T T T T G AAG AGAC AGT T T T AAAAGAT GTTGCTTTTG GAC C 
ACAAAATTTTGGTATTTCTCAGATTGAAGCTGAAAGGCTGGCTGAAGAAA 
AAT T AAGGT T AGT T GGT AT C AGT GAGGAT T T AT T CGAT AAAAAT C CAT T T 
GAACTTTCTGGAGGGCAGATGAGGCGGGTTGCTATAGCTGGTATTTTAGC 
GAT G GAAC C C AAAG T AC T AGT AC TAG AT GAG C C AAC AG C T GGAC T T GAT C 
CT AAG G GAAG AAAAGAAT T AAT G AC T CT T T T T AAAAAT CTT C AT AAAAAA 
GGAATGACTATCGTCTTAGTGACTCACTTAATGGACGATGTAGCGGATTA 
T GC T GAC TAT GT GT AT GT T T T AGAAG C AGG GAAAGT AAC C T TAT C AGG AC 
AACCAAAGCAGATTTTTCAAGAAGTAGAACTTTTAGAAAGTAAACAATTA 
GGAGTTCCCAAAATCACCAAGTTTGCTCAAAGGCTATCTCATAAGGGATT 
AAATTTACCTAGTTTACCAATTACTATTAACGAATTTGTGGAGGCTATTA 
AGCATGGA 

SEQ ID NO. 7905 

STRAIN 18RS21 

GGAATTGAATTT AAAAAT GTAAGTT AT AC 

CTATCAAGCCGGCACTCCTTTTGAAGGGCGTGCCCTTTTTGACGTCAATC 
TGAAAATTGAAGATGCTTCCTATACCGCGTTCATTGGGCACACAGGTTCT 
GGAAAATCAACTATTATGCAACTTTTGAATGGTTTACATATTCCTACAAA
AGGT GAGGT AAT TGT CG AT GAT T T T T CT AT T AAAG C AGG G G AC AAGAAC A 
AAGAAAT CAAATTT ATAAGGCAAAAAGTTGGTTT AGT TTTT CAATTT CCA 
GAAAGTCAGCTTTTTGAAGAGACAGTTTTAAAGGATGTTGCTTTTGGACC 
ACAAAATTTTGGTATTTCTCAGATTGAAGCTGAAAGGCTGGCTGAAGAAA 
AAT T AAGGT T AGT T GGT AT C AGT GAG GAT T TAT T CGAT AAAAAT CC AT T T 
GAACTTTCTGGAGGGCAGATGAGGCGGGTTGCTATAGCTGGTATTTTAGC 
GAT G G AAC C C AAAGT ACT AGT AC T GGAT GAG CCAAC AG CT GGAC T T GAT C 
CT AAG GGAAG AAAAG AAT T AAT G AC T CT T T T T AAAAAT CT T C AT AAAAAA 
GGAATGACTATCGTCTTAGTGACTCACTTAATGGACGATGTAGCGGATTA 
T GC T G AC T AT GT G T AT GT T T TAG AAG C AGG G AAAGT AACCT TAT C AG G AC 
AACCAAAACAGATTTTTCAAGAAGTAGAACTTTTAGAAAGTAAACAATTA 
GG AGT T C C C AAAAT C AC C AAGT T T G CT C AAAG ACT AT C T CAT AAG G GAT T 
AAAT T T ACC T AGT T T AC C AAT TACT AT T AAC G AAT T T GT GG AGG CT AT T A 
AGCATGGA 

SEQ ID NO. 7906 

STRAIN M732 

GGAATTGAATT T AAAAAT GT AAGT TAT AC 

CTATCAAGCCGGCACTCCTTTTGAAGGGCGTGCCCTTTTTGACGTCAATC 
TGAAAATTGAAGATGTTTCCTATACCGCGTTCATTGGGCACACAGGTTCT 
GGAAAAT C AAC TAT TAT G C AAC TTTT GAAT GGT T T AC AT AT T C C TAG AAA 
AG GT G AGG T AAT TGT CGAT GAT T T T T C TAT T AAAG C AG G G G AC AAG AAC A 
AAGAAATCAAATTT ATAAGGCAAAAAGTTGGTTT AGTTTTTCAATTTCCA 
GAAAGTCAGCTTTTTGAAGAGACAGTTTTAAAGGATGTTGCTTTTGGACC 
ACAAAATTTTGGTATTTCTCAGATTGAAGCTGAAAGGCTGGCTGAAGAAA 
AAT T AAGGT TAG T T G G T AT C AGT GAGG AT T TAT T C GAT AAAAAT C CAT T T 
GAACTTTCTGGAGGGCAGATGAGGCGGGTTGCTATAGCTGGTATTTTAGC 
GAT G G AACC C AAAG TACT AG TACT G GAT GAG C C AAC AG C T G G ACT T GAT C 
C TAAGG G AAG AAAAG AAT T AAT G ACT CT T T T T AAAAAT C T T CAT AAAAAA 
G GAAT G ACT AT C GT C T TAG T G ACT C AC T T AAT GGAC GAT G TAG C GG AT T A 
T G CT G AC TAT GT GT AT GT T T TAG AAG C AGG G AAAG T AAC CT TAT C AG G AC 
AACCAAAACAGATTTTTCAAGAAGTAGAACTTTTAGAAAGTAAACAATTA 
GG AGT T C C C AAAAT C AC C AAG T T T G C T C AAAG AC TAT CT C AT AAGGG AT T 
AAAT T T AC C T AGT T T AC C AAT TAG T AT T AAC GAAT T T GT GG AGG C TAT T A 
AGCATGGA 

SEQ ID NO. 7907 

STRAIN COH1 

GG AAT T GAAT T T AAAAAT GT AAGT TAT AC CT AT CAAGCC 
GGCACTCCTTTTGAAGGGCGTGCCCTTTTTGACGTCAATCTGAAAATTGA 
AGATGTTTCCTATACCGCGTTCATTGGGCACACAGGTTCTGGAAAATCAA 
CTATTATGCAACTTTTGAATGGTTTACATATTCCTACAAAAGGTGAGGTA 
AT T G T C GAT GAT TTTT CT AT T AAAG C AGGG G AC AAG AAC AAAG AAAT C AA 
AT T T AT AAGG C AAAAAG T T G G T T T AGT T T T T C AATT T C C AG AAAGT C AG C 
TTTTTGAAGAGACAGTTTTAAAGGATGTTGCTTTTGGACCACAAAATTTT 
GGTATTTCTCAGATTGAAGCTGAAAGGCTGGCTGAAGAAAAATTAAGGTT 
AGT T G GT AT C AGT GAG GAT T TAT T C GAT AAAAAT CC AT T T G AAC T T T C T G 
GAGGGCAGATGAGGCGGGTTGCTATAGCTGGTATTTTAGCGATGGAACCC 
AAAG T AC T AGT AC T G GAT GAG C C AAC AG C T GGAC T T GAT C CT AAGGG AAG 
AAAAG AAT T AAT G ACT CT T T T T AAAAAT C T T C AT AAAAAAG GAAT G AC T A 
TCGTCTTAGTGACTCACTTAATGGACGATGTAGCGGATTATGCTGACTAT 
GTGTATGTTTTAGAAGCAGGGAAAGTAACCTTATCAGGACAACCAAAACA 
GAT TTT T CAAGAAGT AGAACT TTT AGAAAGT AAAC AAT T AGGAGT T CC C A 
AAAT C ACCAAGTTT GCT C AAAGACT AT CT CAT AAGGGATT AAAT TT AC CT 
AGTTTACCAATTACTATTAACGAATTTGTGGAGGCTATTAAGCATGGA 

SEQ ID NO. 7908 

STRAIN M781

GGAAT T GAAT T T AAAAAT GT AAGT TAT AC 

CTATCAAGCCGGCACTCCTTTTGAAGGGCGTGCCCTTTTTGACGTCAATC 
T GAAAAT T G AAG AT G T T T C C TAT AC C G C GT T CAT T GG G C AC AC AGGT T C T 
GGAAAAT C AAC TAT TAT G C AAC TTTT GAAT G G T T T AC AT AT T C C T AC AAA 
AGGTGAGGTAATTGTCGATGATTTTTCTATTAAAGCAGGGGAC AAGAAC A 
AAGAAATCAAATTTATAAGGCAAAAAGTTGGTTTAGTTTTTCAATTTCCA
GAAAGT C AG C T T T T T G AAG AG AC AGT T T T AAAG G AT GT T GC T TT T GG ACC 
ACAAAATTTTGGTATTTCTCAGATTGAAGCTGAAAGGCTGGCTGAAGAAA 
AATTAAGGTTAGTTGGTATCAGTGAGGATTTATTCGATAAAAATCCATTT 
GAACTTTCTGGAGGGCAGATGAGGCGGGTTGCTATAGCTGGTATTTTAGC 
GATGGAACCCAAAGTACTAGTACTGGATGAGCCAACAGCTGGACTTGATC 
CTAAGGGAAGAAAAGAATTAATGACTCTTTTTAAAAATCTTCATAAAAAA 
GG AAT G ACT AT C GT CT T AGT G AC T C ACT T AAT GG ACG AT GT AGC GG AT T A 
T G CT G AC T AT GT G TAT GT T T TAG AAG CAGGG AAAG TAAC CT T AT C AGG AC 
AACCAAAACAGATTTTTCAAGAAGTAGAACTTTTAGAAAGTAAACAATTA 
GG AGT T C C C AAAAT C AC C AAGT T T G CT C AAAG AC TAT CT C AT AAGGG AT T 
AAAT T T AC C T AGT T T AC C AAT T AC TAT TAAC G AAT T T GT GG AGG C T AT T A 
AGCATGGA 

SEQ ID NO. 7909 

STRAIN CJB110 

G G AAT T GAAT T T AAAAAT GT AAG T TAT AC 

CTATCAAGCCGGCACTCCTTTTGAAGGGCGTGCCCTTTTTGACGTCAATC 
T G AAAAT T G AAG AT G CT T C C TAT AC CGC GT T CAT T G GG C AC AC AGGT T CT 
GGAAAATCAACTATTATGCAACTTTTGAATGGTTTACATATTCCTACAAA 
AGGTGAGGTAATTGTCGATGATTTTTCTATTAAAGCAGGGGACAAGAACA 
AAGAAATCAAATTTATAAGGCAAAAAGTTGGTTTAGTTTTTCAATTTCCA 
G AAAG T C AG CT T T T T G AAGAGAC AGT T T T AAAGGAT GT T G CT T T T GG AC C 
ACAAAATTTTGGTATTTCTCAGATTGAAGCTGAAAGGCTGGCTGAAGAAA 
AATT AAGGTTAGTTGGTATC AGT GAGG ATT T ATT CG AT AAAAAT CCATTT 
GAACTTTCTGGAGGGCAGATGAGGCGGGTTGCTATAGCTGGTATTTTAGC 
GAT G GAAC C C AAAGT ACT AGT AC T GG AT GAG C C AAC AG CT GG AC T T GAT C 
C T AAG GGAAG AAAAG AAT T AAT G AC T C T T T T T AAAAAT C T T C AT AAAAAA 
G GAAT G AC TAT C GT C T T AGT G AC T C AC T T AAT G G ACG AT GT AGC G GAT T A 
TGCTGACTATGTGTATGTTTTAGAAGCAGGGAAAGTAACCTTATCAGGAC 
AAC C AAAAC AG AT T T T T C AAG AAGT AG AACT T T T AGAAAGT AAAC AAT T A 
GGAGTT C CC AAAAT C AC C AAG T T T G C T C AAAG AC TAT C T CAT AAG G GAT T 
AAAT T T AC CT AG T T T AC C AAT TACT AT T AACG AAT T T G T GG AG G C TAT T A 
AGCATGGA 

SEQ ID NO. 7910 

STRAIN 1169NT 

GGAAT T GAAT T T AAAAAT GTAA 

GTTATACCTATCAAGCCGGCACTCCTTTTGAAGGGCGTGCCCTTTTTGAC 
GT C AAT C T G AAAAT T G AAG AT GC T T C CT AT AC CG C G T T CAT T GG G C AC AC 
AGGT T CT GG AAAAT C AAC TAT TAT G C AAC T T T T GAAT G GT T T AC AT AT T C 
CT AC AAAAGGT G AGG T AAT T GT C GAT GAT T T T T CT AT T AAAG C AG G G G AC 
AAG AAC AAAG AAAT C AAAT T TAT AAG G C AAAAAGT T G G T T T AGT T T T T C A 
ATTTCCAGAAAGTCAGCTTTTTGAAGAGACAGTTTTAAAGGATGTTGCTT 
TTGGACCACAAAATTTTGGTATTTCTCAGATTGAAGCTGAAAGGCTGGCT 
G AAG AAAAAT T AAG G T T AGT T GG T AT C AG T GAGG AT T TAT T C G AT AAAAA 
TCCATTTGAACTTTCTGGAGGGCAGATGAGGCGGGTTGCTATAGCTGGTA 
TTTTAGCGATGGAACCCAAAGTACTAGTACTGGATGAGCCAACAGCTGGA 
CTTGATCCTAAGGGAAGAAAAGAATTAATGACTCTTTTTAAAAATCTTCA 
TAAAAAAGGAATGACTATCGTCTTAGTGACTCACTTAATGGACGATGTAG 
CGGATTATGCTGACTATGTGTATGTTTTAGAAGCAGGGAAAGTAACCTTA 
T C AGG AC AAC C AAAAC AG AT T T T T C AAGAAGT AGAAC T T T TAG AAAG TAA 
AC AAT TAG GAGT T C C C AAAAT C AC C AAG T T T G CT C AAAG AC TAT C T CAT A 
AGGG AT T AAAT T T AC C T AGT T T AC C AAT TACT AT TAAC GAAT TT G T GG AG 
GCTATTAAGCATGGA 

SEQ ID NO. 7911 

STRAIN JM9130013 

GGAATT GAAT TT AAAAAT GT AAGT T 

ATACCTATCAAGCCGGCACTCCTTTTGAAGGGCGTGCCCTTTTTGACGTT 
AAT C T G AAAAT T G AAG AT G C T T C CT AT AC CG CAT T CAT T GG G C AC AC AGG 
TTCTGGAAAATCAACTATTATGCAACTTTTGAATGGTTTACATATTCCTA 
CAAAAGGT GAGGT AAT T GT C GAT GATT TT T CT AT T AAAGC AGGGG AC AAG 
AAC AAAG AAAT C AAAT T TAT AAG G C AAAAAGT T GGT T T AGT T T T T C AAT T 
TCCAGAAAGTCAGCTTTTTGAAGAGACAGTTTTAAAGGATGTTGCTTTTG
GACCACAAAATTTTGGTATTTCTCAGATTGAAGCTGAAAGGCTGGCTGAA 
GAAAAATTAAGGTTAGTTGGTATTAGTGAGGATTTATTCGATAAAAATCC 
ATTTGAACTTTCTGGAGGGCAGATGAGGCGGGTTGCTATAGCTGGTATTT 
TAG C GAT G G AAC C C AAAGT AC T AGT ACT GG AT GAG C C AAC AG C T GG ACT T 
GAT C CT AAGG G AAG AAAAG AAT T AAT G AC T C T T T T T AAAAAT C T T C AT AA 
AAAAG GAAT G ACT AT C GT CT T AGT G ACT C AC T T AAT GG AC G AT GT AG CG G 
ATTATGCTGACTATGTGTATGTTTTAGAAGCAGGGAAAGTAACCTTATCA 
G GAC AAC C AAAAC AG AT T TT T C AAG AAGT AGAAC T T T TAG AAAG T AAAC A 
AT T AGG AG T T C C C AAAAT C AC C AAG T T T G C T C AAAGACT AT C T CAT AAG G 
G AT T AAAT T T AC C T AG T T T AC C AAT T AC TAT T AAC GAAT T T GT GG AGGC T 
ATTAAGCATGGA 

SEQ ID NO. 7912 
STRAIN 2603 frame: 1

MGIEFKNVSYTYQAGTPFEGRALFDVNLKIEDASYTAFIGHTGSGKSTIMQLLNGLHIPTK 
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA 
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES 
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG 

SEQ ID NO. 7913 

STRAIN 090 frame: 1 

GIEFKNVSYTYQAGTPFEGRALFDVNLKIEDASYTAFIGHTGSGKSTIMQLLNGLHIPTK 
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA 
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES 
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG

SEQ ID NO. 7914 

STRAIN 090 frame: 1

GIEFKNVSYTYQAGTPFEGRALFDVNLKIEDASYTAFIGHTGSGKSTIMQLLNGLHIPTK 
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA 
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR 
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES 
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG 

SEQ ID NO. 7915 

STRAIN H36B frame: 1 

GIEFKNVSYTYQAGTPFEGRALFDVNLKIEDASYTAFIGHTGSGKSTIMQLLNGLHIPTK 
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA 
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES 
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG

SEQ ID NO. 7916 

STRAIN 18RS21 frame: 1 

GIEFKNVSYTYQAGTPFEGRALFDVNLKIEDASYTAFIGHTGSGKSTIMQLLNGLHIPTK 
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA 
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES 
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG

SEQ ID NO. 7917

STRAIN M732 frame: 1

GIEFKNVSYTYQAGTPFEGRALFDVNLKIEDVSYTAFIGHTGSGKSTIMQLLNGLHIPTK 
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA 
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR 
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES 
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG 

SEQ ID NO. 7918 

STRAIN COH1 frame: 1 

GIEFKNVSYTYQAGTPFEGRALFDVNLKIEDVSYTAFIGHTGSGKSTIMQLLNGLHIPTK 
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG 

SEQ ID NO. 7919 

STRAIN M781 frame: 1

GIEFKNVSYTYQAGTPFEGRALFDVNLKIEDVSYTAFIGHTGSGKSTIMQLLNGLHIPTK 
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA 
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR 
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG 

SEQ ID NO. 7920 

STRAIN CJB110 frame: 1 

GIEFKNVSYTYQAGTPFEGRALFDVNLKIEDASYTAFIGHTGSGKSTIMQLLNGLHIPTK 
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA 
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR 
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES 
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG 

SEQ ID NO. 7921 

STRAIN 1169NT frame: 1 

GIEFKNVSYTYQAGTPFEGRALFDVNLKIEDASYTAFIGHTGSGKSTIMQLLNGLHIPTK 
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA 
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES 
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG 

SEQ ID NO. 7922 

STRAIN JM9130013 frame: 1 

GIEFKNVSYTYQAGTPFEGRALFDVNLKIEDASYTAFIGHTGSGKSTIMQLLNGLHIPTK 
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA 
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG 

SEQ ID NO. 8001 
STRAIN 2603 

GT G AAC C AC T T AC T T AAC CT C AGT AAAG AAAAT AT AG C T AAAAT AG AT T T T G ACT T T CT T 
AAT G AGG C ACT T AAT G C AAAT AT T C GT T T G AAAG AAT T AGT AGAT G AAC T AAAAAT T T C A 
AAAG AACT G GAC AG T AAAGGT T GGT C C AAAAAAG AC T C T CG AAC G AT AAAAAT C T T GT AC 
GAT G G C CT TAT C AAT AAAC AT AT AGT T T C C C TAG AT C GT G C AG AT TAT AAC AT TAT CC AA 
GTCATTCCATTTGCTAATGTACATGTACTACTGTTTTTAATACCAGAAAGGGAGAATTCT 
AAAAAT TAT AG AAT AT AC AACT AC AGT GAT TAT G AAAT G GAG T T AAT C AAT GAG GAT AGG 
C AAC AAT T T T C AAAAT AT G AAAC AG T T GAT T T AGAC C AAT T GAT ACT T GT T GAT AT T T T T 
AAT ATT GAT G ACT ACATTT CAT C AT AT TT AAC AAT A 

SEQ ID NO. 8002 
STRAIN H36B

AAC C AC T TAG T T AAC C T C AG T AAAG AAAAT AT AG C T 

AAAAT AGAT TTTGACTTTCTTAATGAGGCACTT AAT GCAAAT ATT CGTTT 
G AAAG AAT T AGT AG AT G AAC T AAAAAT T T C AAAAG AAC T GG AC AGT AAAG 
GTT GGT CCAAAAAAGACTCTCGAAC GAT AAAAAT CTTGTACGATGGCCTT 
AT C AAT AAAC AT AT AG T T T C C C T AGAT C GT G C AG AT TAT AAC AT TAT C C A 
AGTCATTCCATTTGCTAATGTACATGTACTACTGTTTTTAATACCAGAAA 
GG GAG AAT T CT AAAAAT TAT Ag AAT AT AC AAC T AC AGT GAT TAT G AAAT G 
GAGTTAATCAATGAGGATAGGCAACAATTTTCAAAATATGAAACAGTTGA 
TTTAGACCAATTGATACTTGTTGATATTTTTAATATT GAT G ACT ACATTT 
CATCATATTTAACAATA 

SEQ ID NO. 8003 
STRAIN 18RS21 

AACCACTTACTTAACCTCAGTAAAGAAAATATAG 
CTAAAATAGATTTTGACTTTCTTAATGAGGCACTTAATGCAAATATTCGT
TTGAAAGAATTAGTAGATGAACTAAAAATTTCAAAAGAACTGGACAGTAA
AGGTTGGTCCAAAAAAGACTCTCGAACGATAAAAATCTTGTACGATGGCC
TTATCAATAAACATATAGTTTCCCTAGATCGTGCAGATTATAACATTATC
CAAGTCATTCCATTTGCTAATGTACATGTACTACTGTTTTTAATACCAGA
AAGGGAGAATTCTAAAAATTATAGAATATACAACTACAGTGATTATGAAA
TGGAGTTAATCAATGAGGATAGGCAACAATTTTCAAAATATGAAACAGTT
GATTTAGACCAATTGATACTTGTTGATATTTTTAATATTGATGACTACAT
TTCATCATATTTAACAATA

SEQ ID NO. 8004 
STRAIN 2603 frame: 1

VNHLLNLSKENIAKIDFDFLNEALNANIRLKELVDELKISKELDSKGWSKKDSRTIKILY 
DGLINKHIVSLDRADYNIIQVIPFANVHVLLFLIPERENSKNYRIYNYSDYEMELINEDR
QQFSKYETVDLDQLILVDIFNIDDYISSYLTI 

SEQ ID NO. 8005 

STRAIN H36B frame: 1 

NHLLNLSKENIAKIDFDFLNEALNANIRLKELVDELKISKELDSKGWSKKDSRTIKILYD 
GLINKHIVSLDRADYNIIQVIPFANVHVLLFLIPERENSKNYRIYNYSDYEMELINEDRQ 
QFSKYETVDLDQLILVDIFNIDDYISSYLTI 

SEQ ID NO. 8006 

STRAIN 18RS21 frame: 1 

NHLLNLSKENIAKIDFDFLNEALNANIRLKELVDELKISKELDSKGWSKKDSRTIKILYD 
GLINKHIVSLDRADYNIIQVIPFANVHVLLFLIPERENSKNYRIYNYSDYEMELINEDRQ 
QFSKYETVDLDQLILVDIFNIDDYISSYLTI 

SEQ ID NO. 8101 
STRAIN 090 

AGCAAGCCTAATGTTGTTCAGTTAAA 

TAATCAATATATTAACGATGAGAATCTAAAAAAACGTTACGAAGCTGAGG
AGTTACGCCGAAAAAATCGTTTAATGGGTTGGGTTCTTATTTTTGTCATG
CTTTTATTTATTTTACCCACTTATAATTTAGTTAAGAGTTACAGAACTTT
ACAAGAACGTCGTCAAGAAGTTGTAAAATTAACGAAAGACTATCAGACAT
TAACTAATAGAACTGAGAACCAGAAGTTGCTAGCAAAACAACTAAAAAAT
CCAGATTACGTTCAAAAATATGCTCGAGCTAAGTATTATTTCTCTAAGAC
CGGCGAAATGATTTACCCATTACCAGACCTTTTACCAAAA

SEQ ID NO. 8102 

STRAIN A909 

AGCAAGCCTAATGTTGTTCAGTTAAATAATCAATA

TATTAACGATGAGAATCTAAAAAAACGTTACGAAGCTGAGGAGTTACGCCGAAAAAATCG
TTTAATGGGTTGGGTTCTTATTTTTGTCATGCTtttATTTATTTTACCCACTTATAATTT
AGTTAAGAGTTACAGAACTTTAGAAGAACGTCGTCAAGAAGTTGTAAAATTAACGAAAGA
CTATCAGACATTAACTAATAGAACTGAGAACCAGAAGTTACTAGCAAAACAACTAAAAAA

TCCAGATTACGTTCAAAAATATGCTCGAGCTAAGTATTATTTCTCTAAGACCGGCGAAAT
GATTTACCCATTAGCAGACCT

SEQ ID NO. 8103 
STRAIN H36B

AGCAAGCCTAATGTTGTTCAGTTAAA

TAATCAATATATTAACGATGAGAATCTAAAAAAACGTTACGAAGCTGAGG
AGTTACGCCGAAAAAATCGTTTAATGGGTTGGGTTCTTATTTTTGTCATG
CTTTTATTTATTTTACCCACTTATAATTTAGTTAAGAGTTACAGAACTTT
ACAAGAACGTCGTCAAGAAGTTGTAAAATTAACGAAAGACTATCAGACAT
TAACTAATAGAACTGAGAACCAGAAGTTACTAGCAAAACAACTAAAAAAT
CCAGATTACGTTCAAAAATATGCTCGAGCTAAGTATTATTTCTCTAAGAC
CGGCGAAATGATTTACCCATTAGCAGACCtTTTACCAAAA

SEQ ID NO. 8104 

STRAIN 18RS21 

AGCAAGCCTAATGTTGTTCAGTTAAATAATCAATATATTAACGATGAGAATCTAAAAAAA
CGTTACGAAGCTGAGGAGTTACGCCGAAAAAATCGTTTAATGGGTTGGGTTCTTATTTTT
GTCATGCTTTTATTTATTTTAGCCACTTATAATTTAGTTAAGAGTTACAGAACTTTACAA

GAACGTCGTCAAGAAGTTGTAAAATTAACGAAAGACTATCAGACATTAACTAATAGAACT

GAGAACCAGAAGTTGCTAGCAAAACAACTAAAAAATCCAGATTACGTTCAAAAATATGCT

CGAGCTAAGTATTATTTCTCTAAGACCGGCGAAATGATTTACCCATTACCAGACCTTTTA
CCAAAA

SEQ ID NO. 8105 
STRAIN M732 

AGCAAGCCTAATGTTGTTCAGTTAAA

TAATCAATATATTAACGATGAGAATCTAAAAAAACGTTACGAAGCTGAGG
AGTTACGCCGAAAAAATCGTTTAATGGGTTGGGTTCTTATTTTTGTCATG
CTTTTATTTATTTTACCCACTTATAATTTAGTTAAGAGTTACAGAACTTT
ACAAGAACGTCGTCAAGAAGTTGTAAAATTAACGAAAGACTATCAGACAT
TAACTAATAGAACTGAGAACCAGAAGTTACTAGCAAAACAACTAAAAAAT

CCAGATTACGTTCAAAAATATGCTCGAGCGAAGTATTATTTCTCTAAGAC
CGGCGAAATGATTTAGCCATTACCAGACCtTTTACCAAAA

SEQ ID NO. 8106 
STRAIN COH1 

AGCAAGCCTAATGTTGTTCAGTTAAATAATC

AATATATTAACGATGAGAATCTAAAAAAACGTTACGAAGCTGAGGAGTTA
CGCCGAAAAAATCGTTTAATGGGTTGGGTTCTTATTTTTGTCATGCTttt
ATTTATTTTAGCCACTTATAATTTAGTTAAGAGTTACAGAACTTTACAAG
AACGTCGTCAAGAAGTTGTAAAATTAACGAAAGACTATCAGACATTAACT
AATAGAACTGAGAACCAGAAGTTAGTAGCAAAACAACTAAAAAATCCAGA

TTACGTTCAAAAATATGCTCGAGCGAAGTATTATTTCTCTAAGACCGGCG
AAATGATTTACCCATTACCAGACCTTTTACCAAAA

SEQ ID NO. 8107 
STRAIN M781

AGCaAGCCTAATGTTGTTCAGTT

AAATAATCAATATaTTAACGATGAGAATCTAAAAAAACGTTACGAAGCTG
AGGAGTTACGCCGAAAAAATCGTTTAATGGGTTGGGTTCTTATTTTTGTC
ATGCTTTTATTTATTTTACCCACTTATAATTTAGTTAAGAGTTACAGAAC
TTTACAAGAACGTCGTCAAGAAGTTGTAAAATTAACGAAAGACTATCAGA
CATTAACTAATAGAACTGAGAACCAGAAGTTACTAGCAAAACAACTAAAA

AATCCAGATTACGTTCAAAAATATGCTCGAGCGAAGTATTATTTCTCTAA
GACCGGCGAAATGATTTACCCATTAGCAGACCtTTTACCAAAA

SEQ ID NO. 8108 
STRAIN CJB110 

AGCAAGCCTAATGTTGTTCAGTTAAATAATC

AATATATTAACGATGAGAATCTAAAAAAACGTTACGAAGCTGAGGAGTTA
CGCCGAAAAAATCGTTTAATGGGTTGGGTTCTTATTTTTGTCATGCTttt
ATTTATTTTACCCACTTATAATTTAGTTAAGAGTTACAGAACTTTACAAG
AACGTCGTCAAGAAGTTGTAAAATTAACGAAAGACTATCAGACATTAACT
AATAGAACTGAGAACCAGAAGTTGCTAGCAAAACAACTAAAAAATCCAGA

TTACGTTCAAAAATATGCTCGAGCTAAGTATTATTTCTCTAAGACCGGCG
AAATGATTTACCCATTACCAGACCtTTTACCAAAA

SEQ ID NO. 8109 
STRAIN 1169NT 

AGCAAGCCTAATGTTGTTCAGTTAAA 

TAATCAATATATTAACGATGAGAATCTAAAAAAACGTTAGGAAGCTGAGG
AGTTACGCCGAAAAAATCGTTTAATGGGTTGGGTTCTTATTTTTGTCATG
CTTTTATTTATTTTACCCACTTATAATTTAGTTAAGAGTTACAGAACTTT
ACAAGAACGTCGTCAAGAAGTTGTAAAATTAACGAAAGACTATCAGACAT
TAACTAATAGAACTGAGAACCAGAAGTTAGTAGCAAAACAACTAAAAAAT

CCAGATTAGGTTCAAAAATATGCTCGAGCTAAGTATTATTTCTCTAAGAC
CGGCGAAATGATTTACCCATTACCAGACCtTTTAGCAAAA

SEQ ID NO. 8110 

STRAIN JM9130013 

AGCaAGCCTAATGTTGTTCAGTTAAA 
TAATCAATATATTAACGATGAGAATCTAAAAAAACGTTACGAAGCTGAGG
AGTTACGCCGAAAAAATCGTTTAATGGGTTGGGTTCTTATTTTTGTCATG
CTTTTATTTATTTTACCCACTTATAATTTAGTTAAGAGTTACAGAACTTT
ACAAGAACGTCGTCAAGAAGTTGTAAAATTAACGAAAGACTATCAGACAT
TAACTAATAGAACTGAGAACCAGAAGTTACTAGCAAAACAACTAAAAAAT
CCAGATTACGTTCAAAAATATGCTCGAGCGAAGTATTATTTCTCTAAGAC
TGGCGAAATGATTTACCCATTACCAGACCtTTTACCAAAA

SEQ ID NO. 8111 
STRAIN 2603

agcaagcctaatgttgttcagttaaataatcaatatattaacgatgagaa 
tctaaaaaaacgttacgaagctgaggagttacgccgaaaaaatcgtttaa 
tgggttgggttcttatttttgtcatgcttttatttattttacccacttat 
aatttagttaagagttacagaactttacaagaacgtcgtcaagaagttgt 
aaaattaacgaaagactatcagacattaactaatagaactgagaaccaga 
agttgctagcaaaacaactaaaaaatccagattacgttcaaaaatatgct 
cgagctaagtattatttctctaagaccggcgaaatgatttacccattacc 
agaccttttaccaaaa 

SEQ ID NO. 8112 
STRAIN 090

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYNL

VKSYRTLQERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGEM

IYPLPDLLPK 

SEQ ID NO. 8113 

STRAIN A909

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYNL

VKSYRTLQERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGEM 

IYPLPD 

SEQ ID NO. 8114 

STRAIN H36B

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYNL

VKSYRTLQERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGEM 

IYPLPDLLPK 

SEQ ID NO. 8115 

STRAIN 18RS21 

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYNLVKSYRTLQ
ERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGEMIYPLPDLL
PK 

SEQ ID NO. 8116 

STRAIN M732 

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYNL

VKSYRTLQERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGEM

IYPLPDLLPK 

SEQ ID NO. 8117 

STRAIN COH1 

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYNLVK 

SYRTLQERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGEMIY 

PLPDLLPK 

SEQ ID NO. 8118 

STRAIN M781 

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYN 

LVKSYRTLQERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGE 

MIYPLPDLLPK 

SEQ ID NO. 8119 

STRAIN CJB110 

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYNLVK 
SYRTLQERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGEMIY
PLPDLLPK 

SEQ ID NO. 8120 

STRAIN 1169NT 

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYNL

VKSYRTLQERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGEM

IYPLPDLLPK 

SEQ ID NO. 8121 

STRAIN JM9130013 

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYNL

VKSYRTLQERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGEM

IYPLPDLLPK 

SEQ ID NO. 8122 

STRAIN 2603 

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYNLVKSYRTLQ
ERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGEMIYPLPDLL
PK 

SEQ ID NO. 8201 
STRAIN 2603 

ATGAAAAATTTATTGTTAAAATGTAAGGATAAGAAGGTTAAAGCATTTACACTTTTAGAA 
TGTTTGGTAGCATTGGTTACAATCACAGGAGCTTTACTAGTTTATCAAGGACTGACAAAA 
TTGTTGGCTCAACAGATAGTAGTGATGTCTTCTTCCAGTCAGTCTGAATGGGTGTTATTA 
AcTCAGCAACTAAATGCAGAATTTGAAGGCGCTCATCTGGAATATTTAAGACAGAACAAA
CTTTATTTACGTAAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGATGATTTC 
CGTAAGACAGGTTATGATGGTCGAGGTTATCAACCAATGGTTTATGGGTTAGACAATTGT 
CAAATGAGTCAGACCAAAAGTATGGTAAAACTTGTTTTTTATTTTAAGGACGGGTTAAAA 
AGGACATTTTACTATGATTTTAAAGAAGAAACTTAA

SEQ ID NO. 8202 

STRAIN 090

AATTCGAAGGCGCTCACTTGGAATATTTAAGACAGAACAAACTTTATTTA
CGTAAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGATGATTT
CCGTAAGACAGGTTATGATGGTCGAGGTTATCAACCAATGGTTTATGGGT
TAGACAATTGTCAAATGAGTCAAACCAAAAGTATGGTAAAACTTGTTTTT
TATTTTAAGGACGGGTTAAAAAGGACATTTTACTATGATTTTAAAGAAGA
AACT 

SEQ ID NO. 8203 

STRAIN A909 

CAGAATTTGAAGGCGCTCATCTGGAATATTTAAGACAGAACAAACTTTAT
TTACGTAAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGATGA
TTTCCGTAAGACAGGTTATGATGGTCGAGGTTATCAACCAATGGTTTATG
GGTTAGACAATTGTCAAATGAGTCAGACCAAAAGTATGGTAAAACTTGTT
TTTTATTTTAAGGACGGGTTAAAAAGGACATTTTACTATGATTTTAAAGA
AGAAACT 

SEQ ID NO. 8204 

STRAIN H36B

ATGCAGAATTTGAAGGCGCTCATCTGGAATATTTAAGACAGAACAAACTT
TATTTACGTAAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGA
TGATTTCCGTAAGACAGGTTATGATGGTCGAGGTTATCAACCAATGGTTT
ATGGGTTAGACAATTGTCAAATGAGTCAGACCAAAAGTATGGTAAAACTT
GTTTTTTATTTTAAGGACGGGTTAAAAAGGACATTTTACTATGATTTTAA
AGAAGAAACT 

SEQ ID NO. 8205 

STRAIN 18RS21 

AGAATTTGAAGGCGCTCATCTGGAATATTTAAGACAGAACAAACTTTATT
TACGTAAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGATGAT
TTCCGTAAGACAGGTTATGATGGTCGAGGTTATCAACCAATGGTTTATGG
GTTAGACAATTGTCAAATGAGTCAGACCAAAAGTATGGTAAAACTTGTTT
TTTATTTTAAGGACGGGTTAAAAAGGAGATTTTACTATGATTTTAAAGAA
GAAACT 

SEQ ID NO. 8206 

STRAIN M732 

CAGAATTCGAAGGCGCTCACTTGGAATATTTAAGACAGAACAAACTTTAT
TTACGTAAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGATGA
TTTCCGTAAGACAGGTTATAATGGTCGAGGTTATCAACCAATGGTTTATG
GGTTAGACAATTGTCAAATGAGTCAGACCAAAAGTATGGTAAAACTTGTT
TTTTATTTTAAGGACGGGTTAAAAAGGACATTTTACTATGATTTTAAAGA
AGAAACT 

SEQ ID NO. 8207 

STRAIN COH1 

GAATTCGAAGGCGCTCACTTGGAATATTTAAGACAGAACAAACTTTATTT
ACGTAAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGATGATT
TCCGTAAGACAGGTTATAATGGTCGAGGTTATCAACCAATGGTTTATGGG
TTAGACAATTGTCAAATGAGTCAGACCAAAAGTATGGTAAAACTTGTTTT
TTATTTTAAGGACGGGTTAAAAAGGACATTTTACTATGATTTTAAAGAAG
AAACT 

SEQ ID NO. 8208 

STRAIN M781

AGAATTCGAAGGCGCTCACTTGGAATATTTAAGACAGAACAAACTTTATT
TACGTAAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGATGAT
TTCCGTAAGACAGGTTATAATGGTCGAGGTTATCAACCAATGGTTTATGG
GTTAGACAATTGTCAAATGAGTCAGACCAAAAGTATGGTAAAACTTGTTT
TTTATTTTAAGGACGGGTTAAAAAGGACATTTTACTATGATTTTAAAGAA
GAAACT 

SEQ ID NO. 8209 

STRAIN CJB110 

GAATTCGAAGGCGCTCACTTGGAATATTTAAGACAGAACAAACTTTATTT 
ACGTAAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGATGATT
TCCGTAAGACAGGTTATGATGGTCGAGGTTATCAACCAATGGTTTATGGG
TTAGACAATTGTCAAATGAGTCAAACCAAAAGTATGGTAAAACTTGTTTT
TTATTTTAAGGACGGGTTAAAAAGGACATTTTACTATGATTTTAAAGAAG
AAACT 

SEQ ID NO. 8210 

STRAIN 1169NT 

TCGAAGGCGCTCACTTGGAATATTTAAGACAGAACAAACTTTATTTACGT
AAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGATGATTTTCG
TAAGACAGGTTATGATGGTCGAGGTTATCAACCAATGGTTTATGGGTTAG
ACAATTGTCAAATGAGTCAAACCAAAAGTATGGTAAAACTTGTTTTTTAT
TTTAAGGACGGGTTAAAAAGGACATTTTACTATGATTTTAAAGAAGAAAC
T 

SEQ ID NO. 8211 

STRAIN JM9130013 

TGCAGAATTTGAAGGCGCTCATCTGGAATATTTAAGACAGAACAAACTTT
ATTTACGTAAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGAT
GATTTCCGTAAGACAGGTTATGATGGTCGAGGTTATCAACCAATGGTTTA
TGGGTTAGACAATTGTCAAATGAGTCAGACCAAAAGTATGGTAAAACTTG

TTTTTTATTTTAAGGACGGGTTAAAAAGGACATTTTACTATGATTTTAAA 
GAAGAAACT 

SEQ ID NO. 8212 

STRAIN 2603 frame: 1

MKNLLLKCKDKKVKAFTLLECLVALVTITGALLVYQGLTKLLAQQIVVMSSSSQSEWVLL
TQQLNAEFEGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYDGRGYQPMVYGLDNC 
QMSQTKSMVKLVFYFKDGLKRTFYYDFKEET. 

SEQ ID NO. 8213

STRAIN 090 frame: 3 

FEGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYDGRGYQPMVYGLDNCQMSQTKS 
MVKLVFYFKDGLKRTFYYDFKEET

SEQ ID NO. 8214 

STRAIN A909 frame: 3 

EFEGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYDGRGYQPMVYGLDNCQMSQTK 
SMVKLVFYFKDGLKRTFYYDFKEET

SEQ ID NO. 8215 

STRAIN H36B frame: 3

AEFEGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYDGRGYQPMVYGLDNCQMSQT 
KSMVKLVFYFKDGLKRTFYYDFKEET 

SEQ ID NO. 8216 

STRAIN 18RS21 frame: 2 

EFEGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYDGRGYQPMVYGLDNCQMSQTK 
SMVKLVFYFKDGLKRTFYYDFKEET 

SEQ ID NO. 8217 

STRAIN M732 frame: 3 

EFEGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYNGRGYQPMVYGLDNCQMSQTK 
SMVKLVFYFKDGLKRTFYYDFKEET 

SEQ ID NO. 8218 

STRAIN COH1 frame: 1 

EFEGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYNGRGYQPMVYGLDNCQMSQTK 
SMVKLVFYFKDGLKRTFYYDFKEET 

SEQ ID NO. 8219 

STRAIN M781 frame: 2

EFEGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYNGRGYQPMVYGLDNCQMSQTK
SMVKLVFYFKDGLKRTFYYDFKEET

SEQ ID NO. 8220 

STRAIN CJB110 frame: 1 

EFEGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYDGRGYQPMVYGLDNCQMSQTK 
SMVKLVFYFKDGLKRTFYYDFKEET 

SEQ ID NO. 8221 

STRAIN 1169NT frame: 3 

EGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYDGRGYQPMVYGLDNCQMSQTKSM 
VKLVFYFKDGLKRTFYYDFKEET

SEQ ID NO. 8222 

STRAIN JM9130013 frame: 2 

AEFEGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYDGRGYQPMVYGLDNCQMSQT
KSMVKLVFYFKDGLKRTFYYDFKEET

SEQ ID NO. 8301
STRAIN 2603 

atgaaaaagattcgattatcaaagtttattaaaatgattgttgttattttgtttttaatt 
agtgtagcagctagtttttattttttccacgttgcccaagttcgagatgataaatccttt 
atttcaaatggtcaacgtaagcctggaaactctttatatgcttatgataaatcctttgat 
aagctattaaagcaaaaaatagaaatgacaaaccaaaatataaagcaagttgcttggtat 
gttcctgctgttaagaaaactcataagacagctgttgtcgttcatggttttgcgaatagc 
aaagagaatatgaaggcatatggttggctgtttcataagttaggatacaatgttcttatg 
cctgacaatattgcacatggtgaaagtcatgggcagttgataggctatggctggaacgac 
cgcgagaacattatcaaatggacagaaatgatagttgataagaatccatcaagccaaatt 
actttatttggtgtttcaatgggtggagcaacagtcatgatggctagtggtgaaaaatta 
cctagtcaggttgttaatatcattgaagattgcggttattctagtgtttgggatgaatta 
aaatttcaggctaaagagatgtatggtttaccagccttcccactcttatatgaagtttca 
acaatttctaaaatcagagcaggtttttcgtatggacaagcaagtagtgtcgaacaattg 
aaaaagaataatttaccagccctctttattcatggtgataaggataattttgttccaaca

agtatggtttatgacaactataaagctacagcaggtaagaaagagctttatattgtaaaa 

ggggcaaaacatgcgaaatcttttgaaacagagccagaaaaatatgagaaacgtatctct 
agttttttgaaaaaatatgaaaaa 

SEQ ID NO. 8302 
STRAIN 090 

GCTAGTTTTTATTTTTTCCACGTTGCCCAAGTTCG 

AGATGATAAATCCTTTATTTCAAATGGTCAACGTAAGCCTGGAAACTCTT 
TAT AT GC T TAT G AT AAAT C C T T T G AT AAGCT AT T AAAGC AAAAAAT AG AA 
ATGACAAACCAAAATATAAAGCAAGTTGCTTGGTATGTTCCTGCTGCTAA 
GAAAACT C AT AAG AC AG CTGTTGTCGTT CAT GGTTTTGC G AAT Ag C AAAG 
AG AAT AT GAAG G CAT AT G GT TGGCTGTTT C AT AAGT T AGG AT AC AAT GT T 
c T T AT G C C T G AC AAT AT T G C AC AT GG t G AAAGT CAT G GG C AG T T GAT AG G 
C TAT GG C T G GAACG AC C GC GAG AAC AT TAT C a AAT G G AC AGAAAT GAT AG 
TTGATAAGAATCCATCAAGCCAAATTACTTtaTTTGGTGTTTCAATGGGT 
GGAGCAACAGTCATGATGGCTAGTGGTGAAAAATTACCTAGTCAGGTTGT 
TAATATCATTGAAGATTGCGGTTATTCTAGTGTTTGGGATGAATTAAAAT 
TTCAGGCTAAAGAGATGTATGGTTTACCAGCCTTCCCACTCTTATATGAA 
GT T T C AAC AAT T T C T AAAAT C AG AG C AGG T T T T T C GT AT G G AC AAG C AAG 
T AGT G T CG AAC AAT T G AAAAAGAAT AATT T AC C AG C C C T C T T T AT T CAT G 
GTGATAAGGATAATTTTGTTCCAACAAGTATGGTTTATGACAACTATAAA 
G C T AC AGC AG GT AAGAAAG AG CT T TAT AT T GT AAAAG GGGC AAAAC AT GC 
G AAAT C T T T T G AAAC AG AG C C AG AAAAAT AT GAG AAAC GT AT C T C T AGT T 
T T T T G AAAAAAT AT G AAAAA 

SEQ ID NO. 8303 
STRAIN A909 

AATCCTTTATTTCAAATGGTCAACGTAAGCCTGGAAACTCTTTATATGCT 
TAT GAT AAAT C C T T T GAT AAG C T AT T AAAG C AAAAAAT AG AAAT G AC AAA 

CCAAAATATAAAGCAAGTTGCTTGGTATGTTCCTGCTGCTAAGAAAACTC 
AT AAG AC AG C T GT T GT CGT T CAT GGTTTTGC G AAT AG C AAAG AG AAT AT G 

AAGGCATATGGTTGGCTGTTTCATAAGTTAGGATACAATGTTCTTATGCC 
T G AC AAC AT T G C AC AT GG T G AAAGT CAT G G G C AGT T GAT AG G CT AT G G CT 
GG AAC G AC C G C G AGAAC AT T AT C AAAT GG AC AG AAAT GAT AGT T GAT AAG 

AATTCATCAAGCCAAATTACTTTATTTGGTGTTTCAATGGGTGGAGCAAC 
AGT CAT GAT G G C T AGT G G T G AAAAAT T AC CT AGT C AG GT T GT T AAT AT C A 

TTGAAGAtTGCGGTTATTCTGGTGTTTGGGATGAATTAAAATTTCAGGCT 
AAAG AG AT GT ATGG TT T AC C AG C C T T C C C AC T C T TAT AT G AAGT T T C AAC 
AAT T T CT AAAAT C AG AG C AG GT T T T T C G TAT GG AC AAg C AAG T AGT GT C G 
AAC AAT T GAAAAAG AAT AAT T T AC C AGC CC T CT T TAT T CAT G GT GAT AAG 
GAT AAT T T T G T T C C AAC Aa G T AT G GT T T AT G AC AAC TAT AAAG C T AC AG C 
AGGT AAGAAAG AGC TT TAT AT T GT AAAAGGG G C AAAAC AT G C G AAAT CT T 
T T G AAa C AG AG C C AG AAAAAT AT GAG AAAC G T AT CT C T AGT T T T T T G AAA 
AAAT AT G AAAAA 

SEQ ID NO. 8304 

STRAIN H36B 

AGTTTTTATTTTTTCCACGTTGCCCAAGTTCGAGATGATAAATCCTTTAT 
T T C AAAT GGT C AAC GT AAG C C T G GAAAC T CT T TAT AT G C T T AT GAT AAAT 
C CT T T GAT AAG CT ATT AAAG C AAAAAAT AG AAAT G AC AAAC C AAAAT AT A 
AAG C AAGTTGCTTGGTATGTTCCTGCTGCT AAG AAAAC T CAT AAG AC AGC 
TGTTGTCGTT CAT GGTTTTGC G AAT AG C AAAG AG AAT AT GAAG G CAT AT G 

GTTGGCTGTTTCATAAGTTAGGATACAATGTTcTTATGCCTGACAACATT 
G C AC AT GGT G AAAGT CAT G G G C AGT T GAT AGG CT AT G G C T GG AAC G AC C G 
C GAG AAC AT TAT C AAAT GG AC AG AAAT GAT AG T T GAT AAG AAT T CAT C AA 

GCCAAATTACTTTATTTGGTGTTTCAATGGGTGGAGCAACAGTCATGATG 
G C T AGT G G T G AAAAAT T AC C TAG T C AG GT T GT T AAT AT CAT T GAAG AT T G 

CGGTTATTCtGGTGTTTGGGATGAATTAAAATTTCAGGCTAAAGAGATGT 
ATGGTTTACCAGCCTTCCCACTCTTATATGAAGTTTCAACAATTTCTAAA 
AT C AGAG C AG GT T T T T C GT AT G G AC AAg C AAGT AG T G T CG AAC AAT T G AA 
AAAG AAT AAT T T AC C AG C CC T C T T T AT T CAT G GT G AT AAG GAT AAT T T T G 
TTCCAACAAGTATGGTTTATGACAACTATAAAGCTACAGCAGGTAAGAAA
GAGCTTTATATTGTAAAAGGGGCAAAACATGCGAAATCTTTTGAAACAGA
GCCAGAAAAATATGAGAAACGTATCTCTAGTTTTTTGAAAAAaTATgAAA
AA 

SEQ ID NO. 8305 

STRAIN 18RS21 

GCTAGTTTTTATTTTTTCCACGTTGCCCAAGTTCGA 

GAT GAT AAAT C CT TT AT T T C AAAT GGT C AAC GT AAG C CT GG AAAC T CT T T 
AT AT G CT T AT G AT AAAT C CT T T GAT AAGCT AT T AAAG C AAAAAAT AGAAA 
TGACAAACCAAAATATAAAGCAAGTTGCTTGGTATGTTCCTGCTGTTAAG 
AAAACTCATAAGACAGCTGTTGTCGTTCATGGTTTTGCGAATAGCAAAGA 
GAAT AT G AAGG CAT AT GGT T G G C T GT T T C AT AAGT T AGG AT AC AAT GT T C 
TTATGCCTGACAATATTGCACATGGTGAAAGTCATGGGCAGTTGATAGGC 
TAT G G CT GG AAC G AC CGC G AGAAC AT TAT C AAAT GGAC AGAAAT GAT AGT 
TGATAAGAATCCATCAAGCCAAATTACTTTATTTGGTGTTTCAATGGGTG 
GAGCAACAGTCATGATGGCTAGTGGTGAAAAATTACCTAGTCAGGTTGTT 
AATATCATTGAAGATTGCGGTTATTcTAGTGTTTGGGATgAATTAAAATT 
TCAGGCTAAAGAGATGTATGGTTTACCAGCCTTCCCACTCTTATATGAAG 
T T T C AAC AAT T T CT AAAAT C AGAG C AG GT T T T T C GT AT G G AC AAg C AAGT 
AG T GT C GAAC AAT T GAAAAAG AAT AAT T T AC C AG C C CT C T T TAT T CAT GG 
T G AT AAGGAT AAT T T T GT T C C AACAAG T AT GG T T TAT G AC AACT AT AAAG 
CTACAGCAGGTAAGAAAGAGCTTTATATTGTAAAAGGGGCAAAACATGCG 
AAAT C T T T T GAAa C AG AG C C AG AAAAAT AT GAGAAACGT AT CT C T AGT T T 
T T T G AAAAAAT AT GAAAAA 

SEQ ID NO. 8306 

STRAIN M732 

GCTAGTTTTTATTTTTTCCACGTTGCCCAAGTTCGA 

GAT GAT AAAT C C T T T AT T T C AAAT GGT C AACGT AAG C CT GG AAAC T C T T T 
AT AT G C T TAT GAT AAAT C CTT T GAT AAGC T AT T AAAG C AAAAAAT AGAAA 
TGACAAACC AAAAT AT AAAG C AAGT TGCTTGGTATGTTCCTGCTGCT AAG 
AAAACT CAT AAG AC AGT T GTTGTCGTT CAT GGT T T T GC GAAT AG C AAAG A 
GAATATGAAGGCATATGGTTGGCTGTTTCATAAGTTAGGATACAATGTTC 
TTATGCCTGACAACATTGCACATGGTGAAAGTCATGGGCAGTTGATAGGC 
TAT GG CT GG AAC G AC CG C GAG AAC AT TAT C AAAT G G AC AG AAAT GAT AGT 
GGATAAGAATCCATCAAGCCAAATTaCTTTATTTGGTGTTTCAATGGGTG 
GAGCAACAGTCATGATGGCTAGTGGTGAAAAATTACCTAGTCAGGTTGTT 
AATATCATTGAAGATTGTGGTTATTCTAGTGTTTGGGATGAATTAAAATT 
T C AG GC T AAAG AGAT GT AT G GT T T AC C AG C C T T C C C ACT CT T AT AT G AAG 
T T T C AAC AAT T T C T AAAAT C AG AG C AG GT T TT T C GT AT GGAC AAg C AAGT 
AGTGTCGAACAATTGAAAAAGAATAATTTACCAGCCCTcTTTATTCATGG 
TGAT AAGGAT AATTT TGT T CCAACAAGTATGGT T TATGACAACTAT AAAG 
CTACAGCAGGTAAGAAAGAGCTTTATATTGTAAAAGGGGCAAAACATGCG 
AAAT C T T T T G AAAC AGAG C C AG AAAAAT AT GAG AAACGT AT CT C T AGT T T 
T T T G AAAAAAT AT GAAAAA 

SEQ ID NO. 8307 

STRAIN COH1 

GCTAGTTTTTATTTTTTCCACGTTGCCCAAGTTC 

GAGATGAT AAAT CCTTT AT TTC AAAT GGT CAACGTAAGCCTGGAAACTCT 
T TAT AT G C T TAT GAT AAAT C C T T T GAT AAG C T AT TAAAGC AAAAAAT AG A 
AATGaC AAAC C AAAAT AT AAAGC AAGT TGCTTGGTATGTTCCTGCTGCT A 
AG AAAACT CAT AAG AC AGT TGTTGTCGTT CAT G GT T T T G CG AAT AG C AAA 
GAG AAT AT G AAG G CAT AT GGTTGGCTGTT T CAT AAG T TAG GAT AC AAT G T 
T C T TAT G C CT G AC AAC AT T G C AC AT GGT G AAAGT CAT G GGC AG T T GAT AG 
G C T AT GGC T GG AAC G AC CG C GAG AAC AT TAT C AAAT GGAC AG AAAT GAT A 
GTGGATAAGAATCCATCAAGCCAAATTACTTTATTTGGTGTTTCAATGGG 
T GG AGC AAC AGTC AT G AT GGCT AGT GGTG AAAAAT TACCT AGT CAGGTTG 
TTAATATCATTGAAGATTGTGGTTATTcTAGTGTTTGGGATgAATTAAAA 
TTTCAGGCTAAAGAGATGTATGGTTTACCAGCCTTCCCACTCTTATATGA 
AGT T T C AAC AAT T T C T AAAAT C AG AGC AG G T T T T T C GT AT G G AC AAGC AA 
GTAGTGTCGAACAATTGAAAAAGAATAATTTACCAGCCCTcTTTATTCAT 
GGT GAT AAGG AT AAT T T T G T T C C AAC Aa G TAT GGT T T AT G AC AACT AT AA 
AG C T AC AGC AG G T AAG AAAG AG C T T TAT AT TGT AAAAGG GG C AAAAC AT G 
CGAAATCTTTTGAAaCAGAGCCAGAAAAATATGAGAAACGTATCTCTAGT 
TTTTTGAAAAAATATGAAAAA

SEQ ID NO. 8308 

STRAIN M781 

GCTAGTTTTTATTTTTTCCACGTTGCCCAAGTTCG 

AGATGATAAATCCTTTATTTCAAATGGTCAACGTAAGCCTGGAAACTCTT 
TAT AT G C T TAT G AT AAAT C CT T T G AT AAG CT AT T AAAG C AAAAAAT AG AA 

ATGACAAACCAAAATATAAAGCAAGTTGCTTGGTATGTTCCTGCTGCTAA 
GAAAAC T C AT AAG AC AGT T GT T GT C GT T CAT GGT T T T GC G AAT AG C AAAG 
AG AAT AT GAAG GC AT AT G GT T G GCT G T T T C AT AAGT T AGG AT AC AAT GT T 
CTTATGCCTGACAACATTGCACATGGTGAAAGTCATGGGCAGTTGATAGG 
CTATGGCTGGAACGACCGCGAGAACATTATCAAATGGACAGAAATGATAG 
TGGATAAGAATCCATCAAGCCAAATTaCTTTATTTGGTGTTTCAATGGGT 
GGAGC AAC AGT CAT GAT G G C T AG T GGT G AAAAAT T AC C T AGT C AG GT T G T 
T AAT AT CAT T GAAG AT T GT GG T T AT T c T AGT G T T T G GG AT g AAT T AAAAT 
TTCAGGcTAAAGAGATGTATGGTTTACCAGCCTTCCCACTcTTATATGaA 
GTTTCAacAATTTcTAAAATcAgAGCAGGTTTTTCGTATGGACaAgCAAG 
T Ag T GT CGAAC AAT t G AAAAAG AAT AAT T T AC C AG C C C T c T T TAT T CAT G 
GT GAT AAG GAT AAT T T T G T T C C AAC Aa G TAT G GT T T AT G a C Aa C TAT AAA 

GCTACAGCAGGTAAGAAAGAGCTTTATATTGTAAAAGGGGCAAAACATGC 
GAAAT C T T T T G AAa C AG AG C C AG Aa a AAT AT GAG AAAC G T AT C T CT AGT T 
T T T T G AAAAAAT AT G AAAAA 

SEQ ID NO. 8309 

STRAIN CJB110 

GCTAGTTTTTATTTTTTCCACGTTGCCCAAGTTCGAG 
ATGATAAATCCTTTATTTCAAATGGTCAACGTAAGCCTGGAAACTCTTTA 
T AT GC T TAT GAT AAAT C C T T T GAT AAG CT AT T AAAG C AAAAAAT AGAAAT 

GACAAACCAAAATATAAAGCAAGTTGCTTGGTATGTTCCTGCTGCTAAGA 
AAACTCATAAGACAGCTGTTGTCGTTCATGGTTTTGCGAATAGCAAAGAG 
AAT AT G AAGG C AT AT GGTTGGCTGTTT CAT AAG T T AG GAT AC AAT GT T c T 
TAT G C C T G AC AAT AT T GC AC AT GGT G AAAG T CAT G G G C AGT T GAT AGG CT 
ATGGCTGGAACGACCGCGAGAACATTATCAAATGGACAGAAATGATAGTT
GATAAGAATCCATCAAGCCAAATTACTTTATTTGGTGTTTCAATGGGTGG 
AG C AAC AGT CAT GAT G GC T AG T GGT G AAAAAT T AC C T AGT C AGGT T GT T A 
AT AT CAT T GAAG AT T G C GG T TAT T c T AGTGT T T GG G AT g AAT T AAAAT T T 
C AG G C T AAAGAGAT GT AT G GT T T AC C AG CCT T C C C AC T CT T AT AT GAAG T 
T T C AAC AAT T T C T AAAAT C AG AG C AG GTTTTTCG TAT G G AC AAgC AAGT A 
g T GT CGAAC AAT T G AAAAAG AAT AAT T T AC C AG C C C T c T T T AT T CAT GGT 
GAT AAG GAT AAT T T T GT T C C AAC AAG TAT GGT T TAT GAC AACT AT AAAG C 

TACAGCAGGTAAGAAAGAGCTTTATATTGTAAAAGGGGCAAAACATGCGA 
AAT CT T T T GAAAC AG AG C C AG AAAAAT AT GAG AAAC GT AT CT C T AGT T T T 
TTG AAAAAAT AT GAAAAA 

SEQ ID NO. 8310 

STRAIN 1169NT
GCTAGTTTTTATTTTTTCCACGTTGCCCAAGTTCGA 

GAT GAT AAAT CCTTT ATT TC AAAT GGT CAACGTAAGCCTGGAAACTCTTT 
AT AT GC T T AT GAT AAAT C C T T T GAT AAG C T AT T AAAG C AAAAAAT AG AAA 

TGACAAACCaAAATATAAAGCAAGTTGCTTGGTATGTTCCTGCTGCTAAG 
AAAACTCATAAGACAGCTGTTGTCGTTCATGGTTTTGCGAAtAGCAAAGA 
g AAT AT GAAGG C AT AT GG TTGGCTGTTT CAT AAG T TAG GAT AC AAT GT T c 
T T AT AC C T GAC AAT AT T G C AC AT GGT G AAAG T CAT G G G C AGT T GAT AG G C 
T AT GGC T G G AAC GAC C G C GAG AAC AT TAT C AAAT G GAC AG AAAT GAT AGT 

TGATAAGAATCCATCAAGCCAAATTACTTTATTTGGTGTTTCAATGGGTG 
GAG C AAC AG T CAT GAT GG C T AGT GGT G AAAAAT T AC C T AGT C AG G T T GT T 

AATATCATTGAAGATTgCGGTTATTcTAGTGTTTGGGATgAATTAAAATT 
TCAGGCTAaAGAGATGTATGGTTTaCCAGCCTTCCCACTcTTATATGAAG 
TTTCAACAATTTCTAAAATCAGAGCAGGTTTTTCGTATGGACAAGCAAGT 
AGT GT AG AAC AAT T G AAAAAG AAT AAT T T AC C AG C C C T CT T TAT T CAT GG 
T G AT AAGG AT AATT T T GT T C C AAC AAGT AT GGT T T AT GAC AAC TAT AAAG 
CT AC AG C AG GT AAG AAAG AG C T T TAT AT T GT AAAAG G GG C AAAAC AT G C G 
AAAT CT T T T G AAa C AG AG C C AG AAAAAT AT GAG AAAC GT AT CT C T AGT T T 
T T T G AAAAAAT AT GAAAAA 

SEQ ID NO. 8311

STRAIN JM9130013

GCTAGTTTTTATTTTTTCCACGTTGCCCAAGTTCG 

AGATGATAAATCCTTTATTTCAAATGGTCAACGTAAGCCTGGAAACTCTT 
TAT AT G CT TAT G AT AAAT C CT T T G AT AAG C T AT T AAAG C AAAAAAT AGAA 
ATGaCAAACCAAAATATAAAGCAAGTTGCTTGGTATGTTCCTGCTGTTAA 
GAAAACTCATAAGACAGCTGTTGTCGTT CAT GGTTTTGCGAATAGC AAAG 
AGAATATGAAGGCATATGGTTGGCTGTTTCATAAGTTAGGATACAATGTT 
CTTATGCCTGACAATATTGCACATGGTGAAAGTCATGGGCAGTTGATAGG 
C T AT GGC T GG AACGAC C G C GAGAAC AT TAT C a AAT GG AC AG AAAT GAT AG 
TTGATAAGAATCCATCAAGCCAAATTaCTTTATTTGGTGTTTCAATGGGT 
GGAGCAACAGTCATGATGGCTAGTGGTGAAAAATTACCTAGTCAGGTTGT 
T AAT AT CAT T G AAG AT T G C G GT TAT T cT AGT GT T T GGG AT g AAT T AAAAT 
T T C AGGCT AAAG AG AT GT AT GGT T T AC C AG C CT T C C C ACT C T TAT AT G AA 
GT T T C AAC AAT T T CT AAAAT C AG AG C AG G T T T T T C GT AT GGAC AAGC AAG 
TAG T GT CG AAC AAT T G AAAAAG AAT AAT T T AC C AGC C C T CT T TAT T CAT G 
GT G AT AAGGAT AAT T T T GT T C C AAC AAGT AT G GT T TAT G AC AAC TAT AAA 
GCTACAGCAGGTAAGAAAGAGCTTTATATTGTAAAAGGGGCAAAACATGC 
GAAATCTTTTG AAACAGAGC CAGAAAAAT AT GAGAAACGT AT CT CT AGT T 
T T T T G AAAAAAT AT G AAAAA 

SEQ ID NO. 8312 
STRAIN 2603 frame: 1

MKKIRLSKFIKMIVVILFLISVAASFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFD
KLLKQKIEMTNQNIKQVAWYVPAVKKTHKTAVVVHGFANSKENMKAYGWLFHKLGYNVLM
PDNIAHGESHGQLIGYGWNDRENIIKWTEMIVDKNPSSQITLFGVSMGGATVMMASGEKL
PSQVVNIIEDCGYSSVWDELKFQAKEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQL

KKNNLPALFIHGDKDNFVPTSMVYDNYKATAGKKELYIVKGAKHAKSFETEPEKYEKRIS
SFLKKYEK 

SEQ ID NO. 8313 

STRAIN 090 frame: 1 

ASFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFDKLLKQKIEMTNQNIKQVAWYVPA 
AKKTHKTAVVVHGFANSKENMKAYGWLFHKLGYNVLMPDNIAHGESHGQLIGYGWNDREN
IIKWTEMIVDKNPSSQITLFGVSMGGATVMMASGEKLPSQVVNIIEDCGYSSVWDELKFQ
AKEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQLKKNNLPALFIHGDKDNFVPTSMV
YDNYKATAGKKELYIVKGAKHAKSFETEPEKYEKRISSFLKKYEK

SEQ ID NO. 8314 

STRAIN A909 frame: 3 

SFISNGQRKPGNSLYAYDKSFDKLLKQKIEMTNQNIKQVAWYVPAAKKTHKTAVVVHGFA
NSKENMKAYGWLFHKLGYNVLMPDNIAHGESHGQLIGYGWNDRENIIKWTEMIVDKNSSS
QITLFGVSMGGATVMMASGEKLPSQVVNIIEDCGYSGVWDELKFQAKEMYGLPAFPLLYE
VSTISKIRAGFSYGQASSVEQLKKNNLPALFIHGDKDNFVPTSMVYDNYKATAGKKELYI
VKGAKHAKSFETEPEKYEKRISSFLKKYEK

SEQ ID NO. 8315 

STRAIN H36B frame: 1

SFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFDKLLKQKIEMTNQNIKQVAWYVPAA
KKTHKTAVVVHGFANSKENMKAYGWLFHKLGYNVLMPDNIAHGESHGQLIGYGWNDRENI
IKWTEMIVDKNSSSQITLFGVSMGGATVMMASGEKLPSQVVNIIEDCGYSGVWDELKFQA
KEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQLKKNNLPALFIHGDKDNFVPTSMVY
DNYKATAGKKELYIVKGAKHAKSFETEPEKYEKRISSFLKKYEK

SEQ ID NO. 8316 

STRAIN 18RS21 frame: 1 

ASFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFDKLLKQKIEMTNQNIKQVAWYVPA
VKKTHKTAVVVHGFANSKENMKAYGWLFHKLGYNVLMPDNIAHGESHGQLIGYGWNDREN
IIKWTEMIVDKNPSSQITLFGVSMGGATVMMASGEKLPSQVVNIIEDCGYSSVWDELKFQ
AKEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQLKKNNLPALFIHGDKDNFVPTSMV
YDNYKATAGKKELYIVKGAKHAKSFETEPEKYEKRISSFLKKYEK

SEQ ID NO. 8317 

STRAIN M732 frame: 1

ASFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFDKLLKQKIEMTNQNIKQVAWYVPA
AKKTHKTVVVVHGFANSKENMKAYGWLFHKLGYNVLMPDNIAHGESHGQLIGYGWNDREN
IIKWTEMIVDKNPSSQITLFGVSMGGATVMMASGEKLPSQVVNIIEDCGYSSVWDELKFQ
AKEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQLKKNNLPALFIHGDKDNFVPTSMV

YDNYKATAGKKELYIVKGAKHAKSFETEPEKYEKRISSFLKKYEK

SEQ ID NO. 8318 

STRAIN COH1 frame: 1 

ASFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFDKLLKQKIEMTNQNIKQVAWYVPA
AKKTHKTVVVVHGFANSKENMKAYGWLFHKLGYNVLMPDNIAHGESHGQLIGYGWNDREN
IIKWTEMIVDKNPSSQITLFGVSMGGATVMMASGEKLPSQVVNIIEDCGYSSVWDELKFQ
AKEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQLKKNNLPALFIHGDKDNFVPTSMV
YDNYKATAGKKELYIVKGAKHAKSFETEPEKYEKRISSFLKKYEK

SEQ ID NO. 8319 

STRAIN M781 frame: 1 

ASFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFDKLLKQKIEMTNQNIKQVAWYVPA
AKKTHKTVVVVHGFANSKENMKAYGWLFHKLGYNVLMPDNIAHGESHGQLIGYGWNDREN
IIKWTEMIVDKNPSSQITLFGVSMGGATVMMASGEKLPSQVVNIIEDCGYSSVWDELKFQ
AKEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQLKKNNLPALFIHGDKDNFVPTSMV
YDNYKATAGKKELYIVKGAKHAKSFETEPEKYEKRISSFLKKYEK

SEQ ID NO. 8320 

STRAIN CJB110 frame: 1 

ASFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFDKLLKQKIEMTNQNIKQVAWYVPA
AKKTHKTAVVVHGFANSKENMKAYGWLFHKLGYNVLMPDNIAHGESHGQLIGYGWNDREN
IIKWTEMIVDKNPSSQITLFGVSMGGATVMMASGEKLPSQVVNIIEDCGYSSVWDELKFQ
AKEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQLKKNNLPALFIHGDKDNFVPTSMV

YDNYKATAGKKELYIVKGAKHAKSFETEPEKYEKRISSFLKKYEK

SEQ ID NO. 8321 

STRAIN 1169NT frame: 1 

ASFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFDKLLKQKIEMTNQNIKQVAWYVPA
AKKTHKTAVVVHGFANSKENMKAYGWLFHKLGYNVLIPDNIAHGESHGQLIGYGWNDREN
IIKWTEMIVDKNPSSQITLFGVSMGGATVMMASGEKLPSQVVNIIEDCGYSSVWDELKFQ
AKEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQLKKNNLPALFIHGDKDNFVPTSMV
YDNYKATAGKKELYIVKGAKHAKSFETEPEKYEKRISSFLKKYEK

SEQ ID NO. 8322 

STRAIN JM9130013 frame: 1 

ASFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFDKLLKQKIEMTNQNIKQVAWYVPA
VKKTHKTAVVVHGFANSKENMKAYGWLFHKLGYNVLMPDNIAHGESHGQLIGYGWNDREN
IIKWTEMIVDKNPSSQITLFGVSMGGATVMMASGEKLPSQVVNIIEDCGYSSVWDELKFQ
AKEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQLKKNNLPALFIHGDKDNFVPTSMV
YDNYKATAGKKELYIVKGAKHAKSFETEPEKYEKRISSFLKKYEK

SEQ ID NO. 8401 
STRAIN 2603 

ATGATGAAAGTTTTAGCCTTTGATACTTCAAGCAAAGCACTATCAGTGGCTGTACTAAAC 
AATATGGAATGTTTAGCGACTGTCACTATCAATATCAAAAAGAATCATAGCATTAATTTG 
AT GC C AG C CAT T GAT T T T T T AAT G C AAT C AAT T GAT T T AG AAC C T C AAG AT T T GG AC CGT 
AT C GT AGT AGC AG AGG G T C C AG GAT C T T AT AC GG G CT T AC G T GT AG CT GT T GC T AC AG C A 
AAAATGCTAGCTTATACGCTTAAGATTGACTTAGTTGGAGTATCTAGCCTGTACGCTTTA 
AC AAAT GG AT T T T C AGAAAAT GAT T T AT T G GT AC C ACT TAT AG AT G C AC G ACGT AAT AAT 
GTTTATGTTGGTTTCTATCAAAATGGTGATACTGTTAAACCAGACTGTCACACTTCTCTT 
G AAG AAGT C T T AC AAG AG G T GG GG AAT AAAG C C AAT GT T CAT T T T G T C G G AG AGGT T G C A 
G C AT T T T T T GAT C AG AT T AAG AAAG C C T T AC C AC AT G C T AAAAT T AC AG AAAC T T T AC CT 
TGTGCAGTAGCAATTGGGCGCAAAGGACAAAAAATGAAAAGCGTTAATGTAGATGCGTTT 
GTTCCACGATACTTAAAACGTGTTGAAGCTGAGGAAAATTGGTTAAAAAACCACTGTGAA 
ACGAAT AC AGAAGAAT ATATT AAG AGAGT T 

SEQ ID NO. 8402 

STRAIN 090

AAAGTTTTAGCCTTTGATACTTCAAGCAAAGCACTATCAGTGGCTGTACT
AAAC AAT AT GGAAT GT T T AGC G ACT GT C ACT a T C AAT AT CAAAAAGAAT C 
AT AG CAT T AAT T T G AT GC C AG C CAT T GAT T T T T T AAT G C AAT C AAT T GAT 

TTAGAACCTCAAGATTTGGACCGTATCGTAGTGGCAGAGGGTCCAGGATC 
T T AT AC GGGCT T AC GT GT AGC T GT T G C TAC AG C AAAAATGCT AG CT TATA 

CGCTTAAGATTGACTTAGTTGGAGTATCTAGCCTGTACGCTTTAACAAAT 
GGAT T T T C AG AAAAT GAT T T GT T G GT AC C ACT T AT AGAT G C ACG AC GT AA 
C AAT GT T T AT GT T GGT T T C T AT C AAAAT G GT GAT AC T GT T AAAC C Ag AC T 

GTCACACTTCTCTTGAAGAAGTCTTACAAGAGGTGGGGAATAAAGCCAAT 
GT T CAT T T T GT C GGAG AGGT T G C AGC AT T T T T T GAT C AGAT T AAg AAAGC 
CT TAC C AC AT G C T AAAAT T AC AGAAAC T T TAC C T T G T G C AGT GG C AAT T G 
GGCGCAAAGGACAAAAAATGGAAAGCGTTAATGTAGATGCGTTTGTTCCA 
C GAT ACT T AAAAC G AGT T G AAGCT GAGG AAAAT T GGT T AAAAAAC C AC T G 
TGAAACGAAT 

SEQ ID NO. 8403 

STRAIN A909 

AAAGT TT TAGCCTT T GAT ACTT CAAGC AAAGC ACT AT C AG 
T GG C T GT AC T AAAC AAT AT G GAAT GT T TAG C G AC T GT C AC TAT C AAT AT C 
AAAAAG AAT CAT AG CAT T AAT T T GAT G C C AG C CAT T GAT T T T T T AAT G C A 
AT C AAT T GAT T T AGAAC C T C AAGAT T T GGAC C GT AT C GT AGT AGC AG AGG 

GTCCAGGATCTTATACGGGCTTACGTGTAGCTGTTGCTACAGCAAAAATG 
C T AG CT T AT AC G C T T AAG AT T GACT T AGT T GGAG TAT C T AGC CT GT ACG C 
T T T AAC AAAT GGAT T T T C AG AAAAT GAT T TAT T G GT AC C AC T TAT AG AT G 
CACGACGT AACAATGTTTATGT TGGTTT CT AT CAAAATGGAGAT ACTGT T 
AAAC C AGAC T GT C AC AC T T C T C T T GAAG AAGT C T T AC AAGAGGT G GG G AA 
TAAAGCCAATGTTCATTTTGTCGGAgAGGTTGCAgCATTTGTTGACCAGA 
tTAAgAAAGTTTTACCACATGCTAAAATTACAGAAACTTTACCTTGTGCA 
GtGGCAATTGGGCGCAAAGGACAAAAAATGAAAAGCGTTAATGTAGATGC 
GTTTGTTCCAC GAT ACTT AAAACGTGTTGAAGCTGAGGAAAATT GGT TAA 
GAAACCACTGTG AAAC GAAT 

SEQ ID NO. 8404 

STRAIN H36B

AAAGT T T T AGC CT T T GAT ACT T CAAGC AAAG C AC TAT C A 
GTGGCTGTACTAAACAATATGGAATGTTTAGCGACTGTCACTATCAATAT 
CAAAAAGAAT CAT AGCATT AATTTGATGCCAGCCATT GATT T T TT AAT GC 
AAT C AAT T GAT T TAG AAC CT C AAG AT T T GG AC C GT AT C G TAG TAG C AG AG 

GGTCCAGGATCTTATACGGGCTTACGTGTAGCTGTTGCTACAGCAAAAAT 
G CT AG C T T AT AC GC T T AAGAT T G AC T T AGT T GG AGT AT CT AG C C T GT AC G 
CT T T AAC AAAT GGAT T T T C AG AAAAT GAT T TAT T G GT AC C ACT TAT AG AT 
GC AC GAC GT AAC AAT GT T T AT GT T G GT T T C TAT C AAAAT G GAG AT ACT GT 
T AAAC C AG AC T G T C AC ACT T CT C T T GAAG AAGT C T T AC AAG AG GT G GG G A 
AT AAAG C C AAT GT T CAT T T T G T C GGAG AG G TT G C AG CAT T T G T T GAC C AG 
AT T AAG AAAGT T T T AC C AC AT G C T AAAAT TAC AG AAAC T T TAC C T T GT G C 
AGT GG C AAT T G GG C G C AAAG G AC AAAAAAT G AAAAG C G T T AAT GT AG AT G 
CGTTTGTTC C AC GAT AC T T AAAAC GT GT T G AAGC T GAG G AAAAT T G GT T A 
AGAAAC C AC T GT G AAAC G AAT AC AG AAGAAT AT AT T AAG AG AG T T 

SEQ ID NO. 8405 

STRAIN 18RS21 

AAAG T T T TAG C CT T T GAT ACT T C AAG C AAAG C AC TAT C A 
GT GG C T GT AC T AAAC AAT AT G GAAT GT T T AGC GACT GT C AC TAT C AAT AT 
CAAAAAGAAT CAT AGCATT AATT T GAT GC CAGCCAT TGAT T T T TTAATGC 
AAT C AAT T GAT T TAG AAC C T C AAG AT T T GG AC CG T AT C G T AGT AG C AGAG 
GGT C CAGGAT CTT AT ACGGGCT TACGTGTAGCTGTT GCT AC AGC AAAAAT 
GCTAGCTTATACGCTTAAGATTGACTTAGTTGGAGTATCTAGCCTGTACG 
CTTTAACAAATGGATTTTCAGAAAATGATTTATTGGTACCACTTATAGAT 
GCACGACGTAATAATGTTTATGTTGGTTTCTATCAAAATGGTGAT ACTGT 
T AAAC C AG ACT G T C AC AC T T CT C T T GAAG AAGT CT T AC AAGAGG T GG GG A 
ATAAAGCCAATGTTCATTTTGTCGGAGAGGtTGCAGCATTTTTTGATCAg 
AT T AAg AAAG C CT T AC C AC AT G C T AAAAT TAC AG AAAC T T T AC C T T G T G C 
AGT AG C AAT T G GG C G CAAAGG AC AAAAAAT GAAAAG C G T T AAT GT AG AT G 
CGTTTGTTCCACGATACTTAAAACGTGTTGAAGCTGAGGAAAATTGGTTA
AAAAACCACTGTGAAACGAATACAGAAGAATATATTAAGAGAGTT

SEQ ID NO. 8406 

STRAIN M732 

AAAGT T T TAG C CT TT GAT AC T T C AAG C AAAGC ACT AT C A 

GTGGCTGTACTAAACAATATGGAATGTTTAGCGACTGTCACTATCAATAT 
C AAAAAG AAT CAT AG CAT T AAT T T GAT G C C AG C CAT T GAT T T T T T AAT G C 
AAT CAAT T G ATT T AG AAC CT C AAG AT T T GG AC C GT AT C GT AGT AG C AG AG 
GGTCCAGGATCTTATACGGGCTTACGTGTAGCTGTTGCTACAGCAAAAAT 
G C T AG CT T AT ACG CT T AAG AT T G AC T T AGT T G G AGT AT C TAG C C TGT AC G 
C TT T AAC AAAT GG AT T T T C AGAAAAT GAT T T AT T G GT AC C AC T TAT AG AT 
GCACGACGTAACAATGTTTATGTTGGTTTCTATCAAAATGGTGATACTGT 
TAAACCAGACTGTCACACTTCTCTTGAAGAAGTCTTACAAGAGGTGGGGA 
ATAAAGCCAATGTTCATTTTGTCGGAGAGGTTGCAGCATTTTTTGATCAG 
AT T AAGAAAG C CT T AC C AC AT G C T AAAAT T AC AG AAAC T T T AC CT T G T G C 
AG TAG CAAT T GGG CG C AAAG G AC AAAAAAT G AAAAG C G T T AAT GT AG Ann 
CGTTTGTTCCACGATACTTAAAACGTGTTGAAGCTGAGGAAAATTGGTTA 
AAAAAC C AC TGT G AAAC GAAT AC AG AAGAAT AT AT T AAG AG AGT T 

SEQ ID NO. 8407 

STRAIN COH1 

AAAGT TTT AG CC TTT GAT AC TTCAAGC AAAGC AC 

TATCAGTGGCTGTACTAAACAATATGGAATGTTTAGCGACTGTCACTATC 
AAT AT C AAAAAG AAT CAT AG CAT T AAT TT GAT G C C AG C CAT T GAT T T T T T 
AAT G CAAT CAAT T GAT T TAG AAC C T C AAG AT T T GG AC C G T AT C G T AGT AG 

CAGAGGGTCCAGGATCTTATACGGGCTTACGTGTAGCTGTTGCTACAGCA 
AAAAT G C TAG C T T AT AC GC T T AAG AT T G AC T T AGT T G G AGT AT C T AG C C T 

GTACGCTTTAACAAATGGATTTTCAGAAAATGATTTATTGGTACCACTTA 
TAGATGCACGACGTAACAATGTTTATGTTGGTTTCTATCAAAATGGTGAT 
ACTGTTAAACCAGACTGTCACACTTCTCTTGAAGAAGTCTTACAAGAGGT 
GGGGAATAAAGCCAATGTTCATTTTGTCGGAGAGGTTGCAGCATTTTTTG 
ATCAGATTAAGAAAGCCTTACCACATGCTAAAATTACAGAAACTTTACCT 
T GT G C AGT AGC AAT T GGG C GC AAAG G AC AAAAAAT G AAAAG CGT T AAT GT 
AGATGCGTTTGTTCCACGATACTTAAAACGTGTTGAAGCTGAGGAAAATT 
GGT T AAAAAAC C ACT GT G AAAC G AAT AC AGAAGAAT AT AT T AAG AG AGT T 

SEQ ID NO. 8408 

STRAIN M781 

AAAGTTTTAGCCTTTGATACTTCAAGCAAAGCACTA 

T C AGT GGC T G TACT AAAC AAT AT G G AAT GT T T AG CGACT G T C AC TAT C AA 
TAT C AAAAAG AAT CAT AG C AT T AAT T T GAT G C C AG C CAT T GAT TTT T T AA 
T GCAAT CAAT T G ATT T AGAAC CT C AAG AT T T GGAC C GT AT C GT AGT AT C A 
G AGGGT C C AG GAT CT T AT AC GGG C T T AC G T GT AG C T GT T G C T AC AG C AAA 
AAT G CT AG CT TAT AC G C T T AAG AT T G AC T T AGT T GG AGT AT C T AG C C T GT 
ACGC T TT AAC AAAT G GAT TTT C AG AAAAT GAT T T AT T G G T AC C AC T T AT A 
GATGCACGACGTAACAATGTTTATGTTGGTTTCTATCAAAATGGTGATAC 
T GT T AAAC C AG AC TGT C AC AC T T CT CT T G AAG AAGT CT T AC AAG AGGT G G 

GGAATAAAGCCAATGTTCATTTTGTCGGAGAGGTTGCAGCATTTTTTGAT 
C AG AT T AAG AAAG C C T T AC C AC AT G C T AAAAT T AC AGAAACT TTACCTTG 
T G C AG TAG CAAT T GGG C G CAAAGG AC AAAAAAT G AAAAG C GT T AAT GT AG 
ATGCGTTTGTTCCACGATACTTAAAACGTGTTGAAGCTGAGGAAAATTGG 
T T AAAAAAC C AC T G T G AAAC GAAT AC AG AAG AAT AT AT T AAG AG AG T T 

SEQ ID NO. 8409 

STRAIN CJB110 

AAAGTTTTAGCCTTTGATACTTCAAGCAAAGCACTATCA 
GTGGCTGtaCTAAACAATATGGAATGTTTAGCGACTGTCACTATCAATAT 
C AAAAAGAAT C AT AGCAT T AATT T GATGCCAGCC AT TGAT T T TT T AAT GC 
AAT C AAT T GAT TT AGAAC CTCAAGATTTGGACCGT AT CGT AGTGGCAGAG 
GGTCCAGGATCTTATACGGGCTTACGTGTAGCTGTTGCTACAGCAAAAAT 
GCT AG C T T AT AC GC T T AAG AT T G AC T T AGT T GG AGT AT C T AG CCTGTACG 
C T T T AAC AAAT G GAT TTT C AG AAAAT GAT TTGTTGGTAC C AC T T AT AG AT 
GCACGACGTAACAATGTTTATGTTGGTTTCTATCAAAATGGTGATACTGT 
TAAACCAGACTGTCACACTTCTCTTGAAGAAGTCTTACAAGAGGTGGGGA
ATAAAGCCAATGTTCATTTTGTCGGAGAGGTTGCAGCATTTTTtgATCAG
ATTAAGAAAGCCTTACCACATGCTAAAATTACAGAAACTTTACCTTGTGC 
AGT GGCAATT GGGCGCAAAGGACAAAAAATGGAAAGCGTT AATGTAgAT G 
C GT T T GT T C C AC GAT AC T T AAAACG AGT T G AAG C T GAG GAAAAT T GGT T A 
AAAAACCACTGTGAAACGAATACAGAAGAATATAT TAAGAGAGT T 

SEQ ID NO. 8410 

STRAIN 1169NT 

AAAGTTTTAGCCTTTGATACTTCAAGCAAAGCACTATCA 

GT G GCT GT AC T AAAC AAT AT G GAAT GT T TAG C GACT G T C AC TAT C AAT AT 

CAAAAAGAATCATAGCATTAATTTGATGCCAGCCaTTGATTTTTTAATGC 

AAT CAATTG ATT T AGAACCT CAAGAT TTGGACCGT AT CGT AGT AGCAGAG 

GGTCCAGGATCTTATACGGGCTTACGTGTAGCTGTTGCTACAGCAAAAAT 

GCTAGCTTATACGCTTAAGATTGACTTAGTTGGAGTATCTAGCCTGTACG 

C T T T AAC AAAT GG AT T T T C AGAAAAT GAT T TAT T GGT AC C ACT T AT AGAT 

GCACGACGTAACAATGTTTATGTTGGTTTCTATCAAAATGGTGATACTGT 

TAAACCAGACTGTCACACTTCTCTTGAAGAAGTCTTACAAGAGGTGGGGA 

AT AAAGC C AAT GT T CAT T T T GT CGGAg AGGT T G C AGC AT T T GT T G AC C AG 

AT T AAG AAAGC T T T ACC AC At G CT AAAATT AC AG AAAC T T TAG C T T GT G C 

AGTGGCAATTGGGCGCAAAGGACAAAAAATGGAAAGCGTTAATGTAgATG 

CGTTTGTTCCACGATACTTAAAACGTGTTGAAGCTGAgGAAAATTGGTTA 

AAAAAC C AC T GT GAAACG AAT AC AGAAG AAT AT AT TAAGAGAGT T 

SEQ ID NO. 8411 

STRAIN JM9130013 

AAAGTTTTAGCCTTTGATACTTCAAGCAAAGCACTATCA 
GTGGCTGTACTAAACAATATGGAATGTTTAGCGACTGTCACTATCAATAT 
C AAAAAG AAT CAT AG CAT T AAT T T GAT G C C AG C C AT T GAT T T T T T AAT GC 
AAT C AAT T GAT T T AGAAC C T CAAGAT T T GG AC C GT AT C GT AGT AGC AG AG 
GGTCCAGGATCTTATACGGGCTTACGTGTAGCTGTTGCTACAGCAAAAAT 
gCTAGCTTATACGCTTAAGATTGACTTAGTTGGAGTATCTAGCCTGTACG 
C T T T AAC AAAT G GAT T T T C AGAAAAT GAT T TAT T G GT AC C ACT TAT AG AT 
GCACGACGTAACAATGTTTATGTTGGTTTCTATCAAAATGGAGATACTGT 
TAAACCAGACTGTCACACTTCTCTTGAAGAAGTCTTACAAGAGGTGGGGA 
AT AAAG C C AAT G T T CAT T T T GT CGG AGAG G T T GC AG C AT T T GT T G AC C AG 
AT T AAG AAAGT T T TAG C AC AT G CT AAAAT T AC AG AAAC T T T AC C T T GT GC 
AGTGGCAATTGGGCGCAAAGGACAAAAAATGAAAAGCGTTAATGTAGATG 
CGTTTGTTCCACGATACTTAAAACGTGTTGAAGCTGAGGAAAATTGGTTA 
AG AAAC C AC T GT GAAAC GAAT AC AGAAG AAT AT AT T AAG AG AG T T 

SEQ ID NO. 8412 
STRAIN 2603 frame: 1

MMKVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDR 
IVVAEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNN
VYVGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFFDQIKKALPHAKITETLP 
CAVAIGRKGQKMKSVNVDAFVPRYLKRVEAEENWLKNHCETNTEEYIKRV 

SEQ ID NO. 8413 

STRAIN 090 frame: 1 

KVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDRIV 
VAEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNNVY 
VGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFFDQIKKALPHAKITETLPCA 
VAIGRKGQKMESVNVDAFVPRYLKRVEAEENWLKNHCETN

SEQ ID NO. 8414 

STRAIN A909 frame: 1 

KVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDRIV
VAEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNNVY 
VGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFVDQIKKVLPHAKITETLPCA 
VAIGRKGQKMKSVNVDAFVPRYLKRVEAEENWLRNHCETN

SEQ ID NO. 8415 

STRAIN H36B frame: 1

KVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDRIV
VAEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNNVY
VGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFVDQIKKVLPHAKITETLPCA 
VAIGRKGQKMKSVNVDAFVPRYLKRVEAEENWLRNHCETNTEEYIKRV 

SEQ ID NO. 8416 

STRAIN 18RS21 frame: 1 

KVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDRIV 
VAEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNNVY
VGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFFDQIKKALPHAKITETLPCA 
VAIGRKGQKMKSVNVDAFVPRYLKRVEAEENWLKNHCETNTEEYIKRV 

SEQ ID NO. 8417 

STRAIN M732 frame: 1 

KVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDRIV 
VAEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNNVY
VGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFFDQIKKALPHAKITETLPCA
VAIGRKGQKMKSVNVXXFVPRYLKRVEAEENWLKNHCETNTEEYIKRV

SEQ ID NO. 8418 

STRAIN COH1 frame: 1 

KVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDRIV
VAEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNNVY
VGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFFDQIKKALPHAKITETLPCA 
VAIGRKGQKMKSVNVDAFVPRYLKRVEAEENWLKNHCETNTEEYIKRV 

SEQ ID NO. 8419 

STRAIN M781 frame: 1

KVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDRIV 
VSEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNNVY
VGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFFDQIKKALPHAKITETLPCA 
VAIGRKGQKMKSVNVDAFVPRYLKRVEAEENWLKNHCETNTEEYIKRV 

SEQ ID NO. 8420 

STRAIN CJB110 frame: 1 

KVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDRIV
VAEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNNVY
VGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFFDQIKKALPHAKITETLPCA 
VAIGRKGQKMESVNVDAFVPRYLKRVEAEENWLKNHCETNTEEYIKRV 

SEQ ID NO. 8421 

STRAIN 1169NT frame: 1 

KVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDRIV
VAEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNNVY
VGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFVDQIKKALPHAKITETLPCA 
VAIGRKGQKMESVNVDAFVPRYLKRVEAEENWLKNHCETNTEEYIKRV 

SEQ ID NO. 8422 

STRAIN JM9130013 frame: 1 

KVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDRIV
VAEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNNVY
VGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFVDQIKKVLPHAKITETLPCA
VAIGRKGQKMKSVNVDAFVPRYLKRVEAEENWLRNHCETNTEEYIKRV 

SEQ ID NO. 8501 
STRAIN 2603 

atgagtaaacgacaaaatttaggaattagtaaaaaaggagcaattatatcagggctctca 
gtggcactaattgtagtaataggtggctttttatgggtacaatctcaacctaataagagt 
gcagtaaaaactaactacaaagtttttaatgttagagaaggaagtgtttcgtcctcaact 
cttttgacaggaaaagctaaggctaatcaagaacagtatgtgtattttgatgctaataaa 
ggtaatcgagcaactgtcacagttaaagtgggtgataaaatcacagctggtcagcagtta 
gttcaatatgatacaacaactgcacaagcagcctacgacactgctaatcgtcaattaaat 
aaagtagcgcgtcagattaataatctaaagacaacaggaagtcttccagctatggaatca 
agtgatcaatcttcttcatcatcacaaggacaagggactcaatcgactagtggtgcgacg 
aatcgtctacagcaaaattatcaaagtcaagctaatgcttcatacaaccaacaacttcaa
gatttgaatgatgcttatgcagatgcacaggcagaagtaaataaagcacaaaaagcattg
aatgatactgttattacaagtgacgtatcagggacagttgttgaagttaatagtgatatt 
gatccagcttcaaaaactagtcaagtacttgtccatgtagcaactgaaggtaaactccaa 
gtacaaggaacgatgagtgagtatgatttggctaatgttaaaaaagaccaggctgttaaa 
ataaaatctaaggtctatcctgacaaggaatgggaaggtaaaatttcatatatctcaaat 
tatccagaagcagaagcaaacaacaatgactctaataacggctctagtgctgtaaattat 
aaatataaagtagatattactagccctctcgatgcattaaaacaaggttttaccgtatca 
gttgaagtagttaatggagataagcaccttattgtccctacaagttctgtgataaacaaa 
gataataaacactttgtttgggtatacaatgattctaatcgtaaaatttccaaagttgaa 
gtcaaaattggtaaagctgatgctaagacacaagaaattttatcaggtttgaaagcagga 
caaatcgtggttactaatccaagtaaaaccttcaaggatgggcaaaaaattgataatatt 
gaatcaatcgatcttaactctaataagaaatcagaggtgaaa 

SEQ ID NO. 8502 

STRAIN 090 

T T T T T AT GG GT AC AAT C T C AAC C T AAT AAG AGT G C AGT AAAAAC T AAC T A 
CAAAGTTTTTAATGTTAGAGAAGGAAGTGTTTCGTCCTCAACTCTTTTGA 
C AG G AAAAG C T AAGG CT AAT C AAGAAC AGT AT G T GT AT T T T GAT G CT AAT 

AAAGGTAATCGAGCAACTGTCACAGTTAAAGTGGGTGATAAAATCACAGC 
T G GT C AG C AGT TAG T T C AAT AT GAT AC AAC AAC T G C AC AAGC AG CC T ACG 
AC AC T G C T AAT C GT C AAT T AAAT AAAGT AG C GC GT C AGAT T AAT AAT CT A 
AAGACAACAGGAAGTCTTCCAGCTATGGAATTAAGTGATCAATCTTCTTC 
AT CAT C AC AAGGAC AAGGG AC T C AAT CG AC T AG T GGT G C G AC GAAT C GT C 
TACAGCAAAATTATCAAAGTCAAGCTAATGCTTCATACAACCAACAACTT 
CAAGATTTGAATGATGCTTATGCAGATGCACAGGCAGAAGTAAATAAAGC 
AC AAAAAGC AT T GAAT GAT ACT GT TAT T AC AAGT G AC GT AT C AGG G AC AG 
T T GT T G AAGT T AAT AGT GAT AT T GAT C C AG C T T C AAAAAC T AGT CAAGT A 
C T T GT C C AT GT AG CAACT G AAG GT AAAC T C C AAG T AC AAGG AAC GAT GAG 
TGAGTATGATTTGGCTAATGTTAAAAAAGACCAGGCTGTTAAAATAAAAT 
C T AAG GT C TAT C C T GAC AAGG AAT GGGAAG G T AAAAT T T CAT AT AT C T C A 
AATTATCCAGAAGCAGAAGCAAACAACAATGACTCTAATAACGGCTCTAG 
T G CT G T AAATT AT AAAT AT AAAG T AG AT AT T AC TAG C C C T C T C GAT G C AT 
T AAAAC AAG G T T T T AC C GT AT C AGT T G AAGT AGT T AAT GGAGAT AAGC AC 
CTTATTGTCCCTACAAGTTCTGTGATAAACAAAGATAATAAACACTTTGT 
T T G GGT AT AC AAT GAT T CT AAT CG T AAAAT T T C C AAAGT T G AAGT C AAAA 
T T GG T AAAGC T G AT GC T AAG AC AC AAGAAAT T T TAT C AG GT T T GAAAG C A 
GGAC AAAT CGTGGTT ACT AAT CC AAGT AAAACCTTCAAGGATGGGC AAAA 
AAT T GAT AAT AT T GAAT C AAT C GAT C T T AAC T C T AAT AAG AAAT C AG AG G 

SEQ ID NO. 8503 

STRAIN A909 

T T T T T AT G GGT AC AAT C T C AAC CT AAT AAGAGT G C AGT AAAAAC T AA 
CTACAAAGTTTTTAATGTTAGAGAAGGAAGTGTTTCGTCCTCAACTCTTT 
T GAC AG GAAAAG C T AAGGCT AAT C AAGAAC AGT AT G T G T AT T TT GAT G CT 

AAT AAAGGT AAT CGAGCAACTGTT AC AGTTAAAGTGGGTGAT AAAAT CAC 
AG C T G G T C AG C AGT TAG T T C AAT AT GAT AC AAC AAC T G C AC AAG C AG C C T 
AC GAC AC T G CT AAT CG T C AAT T AAAT AAAGT AGC G C GT C AG AT T AAT AAT 
C T AAAG AC AAC AGG AAGT C T T C C AGC T AT G GAAT C AAG T G AT C AAT C T T C 
AT CAT CAT CAC AAGG AC AAG G G G C T C AAT C G ACT AG T G GT G C G ACG AAT C 
GT C T AC AG C AAAAT TAT C AAAGT CAAGCT AAT G C T T CAT AC AAC C AAC AA 
CTTCAAGATTTGAATGATGCTTATGCAGATGCACAGGCAGAAGTAAATAA 
AGC AC AAAAAGC AT T GAATG AT ACT GTT ATT AC AAGT GACGT AT C AG GG A 
CAGTTGTTGAAGTTAATAGTGATATTGATCCAGCTTCAAAAACTAGTCAA 
GT AC T T GT C CAT GT AG CAACT G AGG G T AAAC T C CAAGT AC AAG G AAC GAT 
GAGTGAGTATGATTTGGCTAATGTTAAAAAAGACCAGTCTGTTAAAATAA 
AAT C T AAGGT C T AT CC T GAC AAG GAAT GGG AAGG T AAAAT T T CAT AT AT C 

TCAAATTATCCAGAAGCAGAAGCAAACAACAATGACTCTAATAACGGCTC 
TAGTGCTGTAAATTATAAATATAAAGTAGATATTACTAGCCCTCTCGATG 
CAT TAAAACAAGGT T TT ACT GT AT CAGTTGAAGT AGT T AAT GGAGAT AAG 
CACCTTATTGTTCCTACAAGTTCTGTGACAAACAAAGATAATAAACACTT 
TGTTTGGGTATACAATGATTCTAATCGTAAAATTTCCAAAGTTGAAGTCA 
AAAT T GGT AAAG CT G AT G C T AAG AC AC AAG AAAT T T T AT C AG GT T T G AAA 
G C AG GAC AAAT C GT GGT T AC T AAT C C AAG C AAAAC T T T C AAG GAT GGG C A 
AAAAATTGATAATATTGAATCAATAGATCTTAAGTCTAATAAGAAATCAG
AGGTGAAA

SEQ ID NO. 8504 

STRAIN H36B

TTTTTATGGGTACAATCTCAACCTAATAAGAGTGCAGTAAAAACTAATTA 
CAAAGTTT TTAATGTT AGAGAAGGAAGT GTTT CGT CCT CAACTCTTT T GA 
CAGGAAAAGCTAAGGCTAATCAAGAACAGTATGTGTATTTTGATGCTAAT 
AAGGGTAATCGAGCAACTGTTACAGTTAAAGTGGGTGATAAAATCACAGC 
T G G T C AG C AGT T AGT T C AAT AT G AT AC AAC AACT G C AC AAGC AG C C T AC G 
ACACTGCTAATCGTCAATTAAATAAAGTAGCGCGTCAGATTAATAATCTA 
AAGACAACAGGAAGT CTT CCAGCT ATGGAAT C AAGT GAT CAAT CTT CAT C 
ATCATCACAAGGACAAGGGACTCAATCGACTAGTGGTGCGACGAATCGTC 
TACAGCAAAAT T AT CAAAGT CAAGCT AATGCTT CAT AC AAC CAAC AACTT 
C AAG AT T T GAAT GAT G CT TAT GC AG AT G C AC AGG C AGAAG T AAAT AAAG C 
AC AAAAAGC AT T G AAT GAT AC T GT T AT T AC AAGT G AC GT AT C AGGGAC AG 
T T GT T G AAGT T AAT AGT GAT AT T GAT C C AGC T T C AAAAAC T AGT C AAGT A 
CTTGTCCATGTAGCAACTGAAGGTAAACTCCAAGTACAAGGAACGATGAG 
TGAGTATGATTTGGCTAATGTAAAAAAAGACCAGGCTGTTAAAATAAAAT 
CTAAGGTCTATCCTGACAAGGAATGGGAAGGTAAAATTTCATATATCTCA 
AAT TAT C C AGAAG C AGAAG C AAAC AAC AAT G ACT C T AAT AAC GG CT C TAG 
TGCTGTAAATTATAAATATAAAGTAGATATTACTAGCCCTCTCGATGCAT 
T AAAAC AAGG T T T T AC T G T AT C AGT T G AAGT AGT T AAT GG AG AT AAGC AC 
C T T AT T GT T C CT AC AAGT T C T GT G AC AAAC AAAG AT AAT AAAC ACT T T GT 
TTGGGTATACAATGATTCTAATCGTAAAATTTCCAAAGTTGAAGTCAAAA 
T T G GT AAAG C T GAT G C T AAGAC AC AAG AAAT T T T AT C AGGT T T G AAAG C A 
G GAC AAAT CG T AGT TACT AAT C C AAG T AAAG C T T T C AAG GAT GG G C AAAA 
AAT T GAT AAT AT T GAAT CAAT C GAT CTT AAGT C T AAT AAG AAAT C AG AG G 
TG 

SEQ ID NO. 8505 

STRAIN 18RS21 

TTTTTATGGGTACAATCTCAACCTAATAAGAGTGCAGTAAAAACTAACTA 
CAAAGTTTTTAATGTTAGAGAAGGAAGTGTTTCGTCCTCAACTCTTTTGA 
C AG GAAAAG CT AAGG C T AAT C AAG AAC AGT AT GT GT ATT T T G ATG C T AAT 
AAAG GT AAT C G AGC AACT G T C AC AG T T AAAGT GG GT GAT AAAAT C AC AG C 
T G G T C AG C AGT T AGT T CAAT AT GAT AC AAC AAC T GC AC AAG C AGC CT AC G 
ACACTGCTAATCGTCAATTAAATAAAGTAGCGCGTCAGATTAATAATCTA 
AAGACAACAGGAAGT CTT CCAGCTATGGAATCAAGTGAT CAAT CTTCTTC 
AT CAT C AC AAG GAC AAG G GAC T CAAT CG ACT AGT GGT GC G AC GAAT C G T C 
T AC AG C AAAAT TAT CAAAGT CAAGCT AAT G CT T CAT AC AAC CAAC AAC T T 
C AAG AT T T GAAT GAT G CT T AT G C AG AT G C AC AGG C AG AAG T AAAT AAAG C 
ACAAAAAGCATTGAAT GAT ACTGTT ATT ACAAGTGACGT AT C AGGGAC AG 
T T GT T GAAGT T AAT AGT GAT AT T GAT C C AGC T T C AAAAAC T AGT C AAGT A 
CT T GT C C AT GT AG C AACT GAAGGT AAAC T C C AAGT AC AAGG AAC GAT GAG 
T G AGT ATGATTTGGCT AAT GTTAAAAAAG AC CAGGCTGTT AAAAT AAAAT 
CTAAGGTCTATCCTGACAAGGAATGGGAAGGTAAAATTTCATATATCTCA 
AAT TAT C C AGAAG C AG AAGC AAAC AAC AAT GAC T C T AAT AAC G G C T C T AG 
T G C T GT AAAT TAT AAAT AT AAAGT AG AT AT T AC TAG C C C T C T C GAT G CAT 
T AAAAC AAG GT T T T AC C GT AT C AGT T GAAGT AG T T AAT GG AG AT AAG C AC 
CTTATTGTCCCTACAAGTTCTGT GAT AAACAAAGAT AAT AAAC ACT TTGT 
T TGGGT AT AC AAT G ATT CT AAT CGT AAAATTTCCAAAGTT GAAGT C AAAA 
T T G GT AAAG CT GAT G CT AAG AC AC AAG AAAT T T T AT C AG GT T T G AAAG C A 
GGACAAATCGTGGTTACTAATCCAAGTAAAACCTTCAAGGATGGGCAAAA 
AAT T GAT AAT AT T GAAT CAAT C GAT CT T AAC T C T AAT AAG AAAT C AG AG 

SEQ ID NO. 8506 

STRAIN M732 

TTTTTATGGGTACAATCTCAACCTAATAAGAGTGCAGTAAAAACTAATTA 
CAAAGTTTTTAATGTTAGAGAAGGAAGTGTTTCGTCCTCAACTCTTTTGA 
C AG GAAAAG C T AAG G C T AAT C AAG AAC AG TAT GT GT AT T T T GAT G C T AAT 
AAAGGTAATCGAGCAACTGTTACAGTTAAAGTGGGTGATAAAATCACAGC 
TGGTCAGCAGTTAGTTCAATATGATACAACAACTGCACAAGCAGCCTACG 
AC AC T G C T AAT CGT C AAT T AAAT AAAG TAG CG C G T C AG AT T AAT AAT C T A 
AAGACAACAGGGAGTTTTCCAGCTATGGAATCAAGTGATCAATCTTCATC
ATCATCACAAGGACAAGGGACTCAATCGACTAGTGGTGCGACGAATCGTC
TACAGCAAAAT TAT CAAAGT C AAGCTAAT GCTT CATACAACC AACAACTT 
C AAG AT T T G AAT GAT G C T T AT G C AG AT G C AC AGG C AG AAGT AAAT AAAG C 
ACAAAAAGCATTGAATGATACTGTTATTACAAGTGACGTATCAGGGACAG 
T T GT T G AAGT T AAT AGT GAT AT T GAT C C AG CT T C AAAAACT AGT C AAGT A 
CT T GT C CAT G T AGC AAC T G AAG GT AAAC T C C AAGT AC AAGG AAC GAT GAG 
TGAGTATGATTTGGCTAATGTTAAAAAAGATCAGGCTGTTAAAATAAAAT 
CTAAGGTCT AT C CT GACAAGGAATGGGAAGGT AAAATTT CAT ATAT CT CA 
AAT TAT C C AG AAG C AGAAG C AAAC AAC AAT G AC T C T AAT AACGG CT CT AG 
T G CT GT AAAT TAT AAAT AT AAAGT AG AT AT TACT AG C C C T C T C GAT G CAT 
T AAAAC AAGGT T T T AC C GT AT C AGT T G AAGT AG T T AAT G GAG AT AAG C AC 
CTTATTGTCCCTACAAGTTCTGTGATAAACAAAGATAATAAACACTTTGT 
TTGGGT AT AC AATGATTCT AAT CGT AAAATTT CC AAAGT TG AAGT CAAAA 
T T GGT AAAG C T G AT GC T AAG AC AC AAG AAAT T T T AT C AG GT T T G AAAG C A 
GG AC AAAT C GT GGT T AC T AAT C C AAG C AAAAC T T T C AAG GAT G G G CAAAA 
AAT T GAT AAT AT T GAAT C AAT CGAT C T T AAGT C T AAT AAGAAAT C AG AG G 
TGAA 

SEQ ID NO. 8507 

STRAIN COH1 

T T T T T AT GGG T AC AAT CT C AAC CT AAT AAGAG T G C AGT AAAAAC 
TAATTACAAAGTTTTTAATGTTAGAGAAGGAAGTGTTTCGTCCTCAACTC 
T T T T G AC AG G AAAAG C T AAG GC T AAT C AAG AAC AGT AT GT GT AT T T T GAT 
GCTAATAAAGGTAATCGAGCAACTGTTACAGTTAAAGTGGGTGATAAAAT 
C AC AG C T G GT C AG C AGT T AGT T C AAT AT GAT AC AAC AAC T G C AC AAG C AG 
C CT AC G AC ACT GCT AAT CGT C AAT T AAAT AAAGT AG C G C GT C AGAT T AAT 
AAT C T AAAG ACAAC AG GGAGT T T T C C AG C TAT G G AAT C AAGT GAT C AAT C 
TT CAT CAT CAT CACAAGGAC AAGGGACT CAAT CGACT AGT GGT GCGACGA 
AT CGT C TACAGCAAAAT TAT CAAAGT C AAGCTAAT GCTT CAT AC AAC C AA 
C AAC T T C AAG AT T T GAAT GAT GCT TAT G C AGAT G C AC AG GC AG AAGT AAA 
T AAAGCACAAAAAGC ATT GAATGAT ACT GT T AT T ACAAGTGACGT AT CAG 
GG AC AGT T GT T GAAGTTAATAGT GAT ATT GAT CCAGCTTC AAAAACT AGT 
C AAG TACT T GT C C AT G T AG C AAC T G AAG GT AAACT C C AAGT AC AAGG AAC 
GAT GAG T G AGT AT GAT T T G G CT AAT G T T AAAAAAG AT C AGG C T GT T AAAA 
T AAAAT C T AAGG T CT AT C C T G AC AAG GAAT GG GAAG GT AAAAT T T CAT AT 
AT CT C AAAT TAT C CAG AAG CAG AAG C AAAC AAC AAT G AC T C T AAT AACGG 
CT C T AGT G CT GT AAAT TAT AAAT AT AAAGT AG AT AT T AC TAG CCCTCTCG 
AT G C AT T AAAAC AAG G T T T T AC C GT AT C AGT T GAAG TAG T T AAT G GAG AT 
AAG C AC CT T AT T GT C C CT AC AAGT T C T GT GAT AAAC AAAGAT AAT AAAC A 
CTTTGTTTGGGTATACAATGATTCTAATCGTAAAATTTCCAAAGTTGAAG 
T C AAAAT T GGT AAAG C T GAT G C T AAGAC AC AAG AAAT T T TAT CAG GT T T G 
AAAG C AGG AC AAAT C GT GGT T AC T AAT C C AAG C AAAACT T T C AAG GAT G G 
GC AAAAAAT T GAT AAT AT T GAAT CAAT C G AT C T T AAGT C T AAT AAG AAAT 
CAGAGGTGAA 

SEQ ID NO. 8507 

STRAIN M781 *

TTTTTATGGGTACAATCTCAACCTAATAAGAGTGCAGTAAAAACTAATTA 
CAAAGTTTTTAATGTTAGAGAAGGAAGTGTTTCGTCCTCAACTCTTTTGA 
CAGGAAAAGCTAAGGCTAATCAAGAACAGTATGTGTATTTTGATGCTAAT 
AAAGGT AAT CGAGCAACTGTTACAGTT AAAGT GGGTGAT AAAAT CAC AGC 
T G GT C AG C AGT T AGT T CAAT AT GAT AC AAC AAC T G C AC AAG CAG C CT AC G 
ACACTGCTAATCGTCAATTAAATAAAGTAGCGCGTCAGATTAATAATCTA 
AAG AC AAC AG G G AGT T T T C CAG C T AT G GAAT C AAGT GAT CAAT C T T CAT C 
AT CAT CAC AAG G AC AAG G G ACT CAAT C G AC T AGT GG T G CG ACG AAT C GT C 
T AC AG C AAAAT TAT CAAAGT C AAG CT AAT GCTT C AT AC AAC C AAC AACT T 
C AAG AT T T GAAT GAT G CT T AT GC AGAT G CAC AGG C AGAAG T AAAT AAAG C 
AC AAAAAG CAT T GAAT GAT AC T GT TAT T AC AAGT GAC GT AT C AG GG AC AG 
T T GT T G AAGT T AAT AGT GAT AT T GAT C C AG C T T C AAAAAC TAG T C AAGT A 
C TT G T C C AT GT AG C AACT GAAG G T AAACT C C AAGT AC AAG G AAC GAT GAG 
TGAGTATGATTTGGCTAATGTTAAAAAAGATCAGGCTGTTAAAAT AAAAT 
C T AAG G T CT AT C C T GAC AAGG AAT G GG AAG GT AAAAT T T CAT AT AT C T C A 
AAT TAT C CAG AAG CAG AAG C AAAC AAC AAT GAC T CT AAT AAC G G C T C T AG 
TGCTGTAAATTATAAATATAAAGTAGATATTACTAGCCCTCTCGATGCAT
TAAAACAAGGTTTTACCGTATCAGTTGAAGTAGTTAATGGAGATAAGCAC
CTT ATTGT CC CTACAAGT T CTGTGAT AAACAAAGATAAT AAAC ACTT T GT 
TTGGGTATACAATGATTCTAATCGTAAAATTTCCAAAGTTGAAGTCAAAA 
T T G GT AAAG C T GAT G C T AAG AC AC AAG AAAT T T TAT C AGGT T T GAAAG C A 
GGACAAATCGTGGTTACTAATCCAAGCAAAACTTTCAAGGATGGGCAAAA 
AATTGAT AATATTGAAT CAAT CGAT CTT AAGT CTAAT AAGAAAT CAGAGG 
TGAA 

SEQ ID NO. 8508 

STRAIN CJB110 

T TT T T AT G G GT AC AAT CT C AAC CT AAT AAGAGT GC AGT AAAAACT AAC T A 
CAAAGTTTTTAATGTTAGAGAAGGAAGTGTTTCGTCCTCAACTCTTTTGA 
CAGGAAAAGCTAAGGCTAATCAAGAACAGTATGTGTATTTTGATGCTAAT 
AAAGGTAATCGAGCAACTGTCACAGTTAAAGTGGGTGATAAAATCACAGC 
TGGTCAGCAGTTAGTTCAATATGATACAACAACTGCACAAGCAGCCTACG 
AC ACT G C T AAT C GT CAAT T AAAT AAAG TAG C G C GT C AG AT T AAT AAT CT A 
AAGAC AAC AGGAAGT CTT CCAGCTATGGAATTAAGT GAT CAAT CTT CTT C 
AT CAT C AC AAG GAC AAG G G AC T CAAT C GACT AG T GGT GC GAC GAAT CGT C 
TAG AG C AAAAT TAT C AAAGT C AAG C T AAT G C T T CAT AC AAC C AAC AACT T 
C AAGAT T T GAAT GAT G CT T AT G C AGAT G C AC AG GC AG AAGT AAAT AAAG C 
AC AAAAAG CAT T GAATG AT AC T G T TAT T AC AAGT GAC GT AT C AG GGAC AG 
T TGT T G AAGT T AAT AGT GAT AT T GAT C C AG CTT C AAAAACT AGT C AAGT A 
CTT GT C C AT GT AGC AACT GAAG GT AAAC T C C AAGT AC AAGGAAC GAT GAG 
TGAGTATGATTTGGCTAATGTTAAAAAAGACCAGGCTGTTAAAATAAAAT 
C T AAG GT CT AT C CT GAC AAG G AAT GG GAAGG T AAAAT T T CAT AT AT C T C A 
AAT TAT C C AG AAG C AG AAGC AAAC AAC AAT G ACTC T AAT AACG G C T CT AG 
TGCTGTAAATTAT AAAT AT AAAGT AGAT ATT ACT AGCCCTCT CGAT G CAT 
TAAAACAAGGTTTTACCGTATCAGTTGAAGTAGTTAATGGAGATAAGCAC 
CT T AT TGTCCCT AC AAGT T CTGTGAT AAAC AAAG AT AAT AAAC ACTT TGT 
T T G G GT AT AC AAT GAT T C T AAT C GT AAAAT T T C C AAAGT TGAAGT C AAAA 
T T GGT AAAG C T GAT G CT AAG AC AC AAG AAAT T T TAT C AG GT T T GAAAG C A 
GGACAAATCGTGGTTACTAATCCAAGTAAAACCTTCAAGGATGGGCAAAA 
AAT T GAT AAT AT T GAAT CAAT C GAT C T T AAC T C T AAT AAG AAAT C AG AG G 
TGA 

SEQ ID NO. 8509 

STRAIN 1169NT 

T T T T T AT GG GT AC AAT CT C AAC C T AAT AAG AG T G C AG T AAAAAC T 
AACTACAAAGTTTTTAATGTTAGAGAAGGAAGTGTTTCGTCCTCAACTCT 
T TT G AC AGGAAAAGC T AAG G C T AAT C AAG AAC AGT AT G T G TAT T T T GAT G 
CTAAT AAAGGT AAT C GAG C AACT GT C AC AGT T AAAG T GG G T GAT AAAAT C 
AC AG CT G GT C AG C AGT TAG T T CAAT AT GAT AC AAC AAC T G C AC AAG C AG C 
CTACGACACTGCTAATCGTCAATTAAATAAAGTAGCGCGTCAGATTAATA 
AT CT AAAGAC AAC AG G AAGT CT T C C AG C TAT GG AAT C AAGT GAT CAAT CT 
T CT T CAT CAT C AC AAGG AC AAGG G AC T CAAT CGAC T AGT GGT GC G AC G AA 
T C GT CT AC AG C AAAAT TAT C AAAGT C AAG C T AAT G C T T CAT AC AAC C AAC 
AACTTCAAGATTTGAATGATGCTTATGCAGATGCACAGGCAGAAGTAAAT 
AAAGCACAAAAAGCATTGAATGATACTGTTATTACAAGTGACGTATCAGG 
GAC AGT T G T T GAAGT T AAT AGT G AT AT T GAT C C AG CT T C AAAAAC T AGT C 
AAGT AC T T GT C CAT GT AGC AAC T G AAG GT AAAC T C C AAGT AC AAGG AAC G 
AT G AGT GAG TAT GAT T T G G C T AAT GT T AAAAAAG AC C AG G C T GT T AAAAT 
AAAAT CT AAGG T C T AT C C T GAC AAGG AATG GGAAGGT AAAAT T T CAT AT A 
T C T C AAAT TAT C C AGAAG CAGAAG C AAAC AAC AAT GAC T C T AAT AAC GG C 
T C T AGT G C T GT AAAT TAT AAAT AT AAAG TAG AT AT T AC TAG C CCT C T CG A 
T G CAT T AAAAC AAG GT T T T AC C GT AT C AG T T GAAG TAG T T AAT G GAG AT A 
AG C AC CT T AT TGT CCC T AC AAG T T CTGT GAT AAAC AAAG AT AAT AAAC AC 
TTTGTTTGGGTATACAATGATTCTAATCGTAAAATTTCCAAAGTTGAAGT 
C AAAAT T G GT AAAG CT G AT G C T AAG AC AC AAG AAAT T T TAT C AG GT T T G A 
AAG C AGG AC AAAT C GT GG T T AC T AAT C C AAGT AAAACC T T C AAG GAT G GG 
CAAAAAATTGATAAT ATT GAAT CAAT CGAT CTT AACT CTAAT AAGAAAT C 
AGAGGTGAA 

SEQ ID NO. 8510 

STRAIN JM9130013

TTTTTATGGGTACAATCTCAACCTAATAAGAGTGCAGTAAAAACTAACTA
CAAAGTTTTTAATGTTAGAGAAGGAAGTGTTTCGTCCTCAACTCTTTTGA 
C AGGAAAAGCTAAGGCTAAT CAAGAAC AGT AT GTGTATTTT GATGCT AAT 
AAAGGTAATCGAGCAACTGTTACAGTTAAAGTGGGTGATAAAATCACAGC 
T GGT C AGC AGTT AGT T C AAT AT G AT AC AAC AACT GC AC AAG C AG C C T AC G 
ACACTGCTAATCGTCAATTAAATAAAGTAGCGCGTCAGATTAATAATCTA 
AAGACAACAGGAAGT CTT CCAGCTATGGAAT CAAGT GAT CAAT CT T CATC 
AT CAT CAC AAGG AC AAG G G GC T C AAT CGACT AGT G GT GCG ACG AAT CGT C 
T AC AG C AAAAT TAT C AAAGT C AAG CT AAT GCT T CAT AC AAC C AAC AACT T 
C AAG AT T T G AAT GAT G CT T AT GC AG AT GC AC AG G C AGAAGT AAAT AAAG C 
AC AAAAAG C AT T GAAT G AT AC T GT T AT T AC AAGT G AC GT AT C AG GG AC AG 
T TGTTGAAGTT AAT AGTGAT ATTGAT CCAGCT T CAAAAACT AGT CAAGT A 
C T T GT C C AT GT AG C AAC T GAG G GT AAAC T C C AAG TAG AAGG AAC GAT GAG 
T GAGTATGATTTGGCT AAT GTT AAAAAAGACCAGT CT GTT AAAAT AAAAT 
CT AAGGT CT AT C CT G AC AAG GAAT GG GAAGGT AAAAT T T CAT AT AT C T C A 
AAT TAT C C AG AAG C AG AAG C AAAC AAC AAT G AC T CT AAT AACGG CT C T AG 
TGCTGTAAATTATAAATATAAAGTAGATATTACTAGCCCTCTCGATGCAT 
T AAAAC AAGG T T T T ACT GT AT C AG TT G AAGT AGT T AAT G GAGAT AAGC AC 
CTT ATTGTTCCT AC AAGTTCTGTG AC AAAC AAAG ATAAT AAAC ACT TTGT 
T T G GGT AT AC AAT GAT T CT AAT C G T AAAAT T T C C AAAGT T G AAGT C AAAA 
T T G GT AAAGC T GAT GCT AAGAC AC AAGAAAT T T T AT C AGGT T T GAAAG C A 
GGACAAATCGTGGTT ACT AAT CCAAGCAAAACTTTCAAGGATGGGC AAAA 
AATTGATAAT AT T GAAT CAAT AG AT CTT AAGT CT AAT AAGAAAT C AG AGG 
TGAAA 

SEQ ID NO. 8511 
STRAIN 2603 frame: 1

MSKRQNLGISKKGAIISGLSVALIVVIGGFLWVQSQPNKSAVKTNYKVFNVREGSVSSST 
LLTGKAKANQEQYVYFDANKGNRATVTVKVGDKITAGQQLVQYDTTTAQAAYDTANRQLN
KVARQINNLKTTGSLPAMESSDQSSSSSQGQGTQSTSGATNRLQQNYQSQANASYNQQLQ
DLNDAYADAQAEVNKAQKALNDTVITSDVSGTVVEVNSDIDPASKTSQVLVHVATEGKLQ
VQGTMSEYDLANVKKDQAVKIKSKVYPDKEWEGKISYISNYPEAEANNNDSNNGSSAVNY
KYKVDITSPLDALKQGFTVSVEVVNGDKHLIVPTSSVINKDNKHFVWVYNDSNRKISKVE 
VKIGKADAKTQEILSGLKAGQIVVTNPSKTFKDGQKIDNIESIDLNSNKKSEVK 

SEQ ID NO. 8512 

STRAIN 090 frame: 1 

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK 
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSLPAMELSDQSSSSSQ 
GQGTQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV 
SGTVVEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQAVKIKSKVYPDK
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEVVNGDKH
LIVPTSSVINKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIVVTNPSK
TFKDGQKIDNIESIDLNSNKKSE 

SEQ ID NO. 8513 

STRAIN A909 frame: 1 

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK 
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSLPAMESSDQSSSSSQ
GQGAQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV
SGTVVEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQSVKIKSKVYPDK
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEVVNGDKH
LIVPTSSVTNKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIVVTNPSK 
TFKDGQKIDNIESIDLKSNKKSEVK 

SEQ ID NO. 8514 

STRAIN H36B frame: 1 

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK 
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSLPAMESSDQSSSSSQ 
GQGTQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV 
SGTVVEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQAVKIKSKVYPDK 
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEVVNGDKH 
LIVPTSSVTNKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIVVTNPSK 
AFKDGQKIDNIESIDLKSNKKSEV 
SEQ ID NO. 8515

STRAIN 18RS21 frame: 1 

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK 
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSLPAMESSDQSSSSSQ 
GQGTQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV 
SGTVVEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQAVKIKSKVYPDK
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEVVNGDKH
LIVPTSSVINKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIVVTNPSK
TFKDGQKIDNIESIDLNSNKKSE 

SEQ ID NO. 8516 

STRAIN M732 frame: 1 

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK 
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSFPAMESSDQSSSSSQ
GQGTQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV
SGTVVEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQAVKIKSKVYPDK
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEVVNGDKH
LIVPTSSVINKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIVVTNPSK
TFKDGQKIDNIESIDLKSNKKSEV

SEQ ID NO. 8517 

STRAIN COH1 frame: 1 

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK 
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSFPAMESSDQSSSSSQ
GQGTQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV 
SGTVVEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQAVKIKSKVYPDK
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEVVNGDKH
LIVPTSSVINKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIVVTNPSK 
TFKDGQKIDNIESIDLKSNKKSEV 

SEQ ID NO. 8518 

STRAIN M781 frame: 1

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK 
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSFPAMESSDQSSSSSQ
GQGTQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV 
SGTVVEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQAVKIKSKVYPDK
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEVVNGDKH
LIVPTSSVINKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIVVTNPSK 
TFKDGQKIDNIESIDLKSNKKSEV

SEQ ID NO. 8519 

STRAIN M781 frame: 1

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK 
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSFPAMESSDQSSSSSQ
GQGTQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV 
SGTVVEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQAVKIKSKVYPDK
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEVVNGDKH
LIVPTSSVINKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIVVTNPSK
TFKDGQKIDNIESIDLKSNKKSEV 

SEQ ID NO. 8520 

STRAIN CJB110 frame: 1 

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK 
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSLPAMELSDQSSSSSQ
GQGTQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV 
SGTVVEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQAVKIKSKVYPDK
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEVVNGDKH
LIVPTSSVINKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIVVTNPSK 
TFKDGQKIDNIESIDLNSNKKSEV 

SEQ ID NO. 8521 

STRAIN 1169NT frame: 1 

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK 
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSLPAMESSDQSSSSSQ
GQGTQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV 
SGTVVEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQAVKIKSKVYPDK
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEVVNGDKH
LIVPTSSVINKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIVVTNPSK
TFKDGQKIDNIESIDLNSNKKSEV 

SEQ ID NO. 8522 

STRAIN JM9130013 frame: 1 

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK 
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSLPAMESSDQSSSSSQ
GQGAQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV
SGTVVEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQSVKIKSKVYPDK
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEVVNGDKH
LIVPTSSVTNKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIVVTNPSK
TFKDGQKIDNIESIDLKSNKKSEVK

SEQ ID NO. 8601 
STRAIN 2603 

atgaaaaaaattggaattattgtcctcacactactgaccttctttttggtatcttgcgga 
caacaaactaaacaagaaagcactaaaacaactatttctaaaatgcctaaaattgaaggc 
ttcacctattatggaaaaattcctgaaaatccgaaaaaagtaattaattttacatattct 
tacactgggtatttattaaaactaggtgttaatgtttcaagttacagtttagacttagaa 
aaagatagccccgtttttggtaaacaactgaaagaagctaaaaaattaactgctgatgat 
acagaagctattgccgcacaaaaacctgatttaatcatggttttcgatcaagatccaaac 
atcaatactctgaaaaaaattgcaccaactttagttattaaatatggtgcacaaaattat 
ttagatatgatgccagccttggggaaagtattcggtaaagaaaaagaagctaatcagtgg 
gttagccaatggaaaactaaaactctcgctgtcaaaaaagatttacaccatatcttaaag 
cctaacactacttttactattatggatttttatgataaaaatatctatttatatggtaat
aattttggacgcggtggagaactaatctatgattcactaggttatgctgccccagaaaaa 
gtcaaaaaagatgtctttaaaaaagggtggtttaccgtttcgcaagaagcaatcggtgat 
tacgttggagattatgcccttgttaatataaacaaaacgactaaaaaagcagcttcatca 
cttaaagaaagtgatgtctggaagaatttaccagctgtcaaaaaagggcacatcatagaa 
agtaactacgacgtgttttatttctctgaccctctatctttagaagctcaattaaaatca 
tttacaaaggctatcaaagaaaatacaaat 

SEQ ID NO. 8602 

STRAIN 090 

G AAG G C T T C AC C T AT TAT GG AAAAAT T C CT GAAAAT C C GAAAAAAGT AAT 
TAATTTTACATATTCTTACACTGGGTATTTATTAAAACTAGGTGTTAATG 
TTTCAAGTTACAGTTTAGACTTAGAAAAAGATAGCCCCGTTTTTGGTAAg 
C AACTGAAAGAAGCT AAAAAAT T AACT GCT GATGAT AC AGAAGCT ATTGC 
CG C AC AAAAAC CT GAT T T AAT C AT GGT T T T C GAT C AAG AT C C AAAC AT C A 
AT AC T CT G AAAAAAAT T G C AC C AAC T T T AGT TAT T AAAt AT GGT G C AC AA 
AAT TAT T TAG AT AT GAT G C C AGC C T T G GGG AAAGT AT T C GGT AAAGAAAA 
AGAAGCTAATCAGTGGGTTAGCCAATGGAAAACTAAAACTCTCGCTGCCA 
AAAAAG AT T T AC AC CAT AT C T T AAAG C C T AAC ACT AC T T T T AC T AT TAT G 
GATTTTTATGATAAAAATATCTATTTATATGGTAATAATTTTGGACGCGG 
t G GAG AACT AAT CT AT GAT T C ACT AGG T TAT G CT GC C C C Ag AAAAAGT C A 
AAAAAgATGTcTTTAAAAAAGGGTGGTTTACCGTTTCgCAAGAAGCAATC 
G G t GAT T ACG T T G GAG AT TAT GCCCTTGTT AAT AT AAAC AAAAC G ACT AA 
AAAAG C AGCT T C a t c AC T T AAAG AAAG T GAT GT C T GG AAG AAT T T AC C AG 
CTGTCaAAAAAGGGCACATCATAGAAAGTAacTACGACGTGTTTTATTTC 
TCTGACCCTCTATCTTTAGAAGCTCAATTAAAATCATTTACAAA 

SEQ ID NO. 8603 

STRAIN A909

GAAGGCTTCACCTATTATGGAAAAATTCCTG 

AAAATCCGAAAAAAGTAATTAATTTTACATATTCTTACACTGGATATTTA 
T T AAAAC T AGG AGT T AAT GT T T C AAG T T AC AGT T TAG AC T T AG AAAAAGA 
TAgCCCCGTTTTTGGTAAaCAACTGAAAGGAGCTAAAAAATTAACTGCTG 
AT GAT AC AGAAGCT AT TGCCGCACAAAAACCT GAT TTAaTCATGGTTTTT 
GAT CAAGAT CC AAAC AT C AAT ACT CT GAAAAAAATTGC ACCAACTTTAGT 
TAT T AAAT AT GGT G C AC AAAAT T AT T T Ag AT a T GAT G C C AG C T T T G GG G A 
AAGTATTCGGTAAAGAAAAAGAAGCTAATCAGTGGGTTAGCCAaTGGAAA
ACTAAAACTCTCGCTGCCAAAAAAGATTTACACCATATCTTAAAACCTAA 
CACT ACT T T TACCAT TAT GGAT T T TT ATGAT AAAAAT ATCT AT T TAT AT G 
GT AAT AAT T T T G GAG G CG GT GG AG AAC T AAT C TAT GAT T C AC TAG G T TAT 
GCTGCCCCAGAAAAAGT CAAAAAAGATGTCTTTAAAAAAGGGT GGT TT AC 
CGTTTCGCAAGAAGCAATCGGTgATTACGTTGGAGATTATGCCCTTGTTA 
AT AT AAAC AAAAC G AC T AAAAAAG C AG CT T CAT C ACT T AAAGAAAG T GAT 
GTCTGGAAGAATTTACCAGCTGTCAAAAAAGGGCACATCATAGAAAGTAA 
CTACGACGTGTTTTATTTCTCTGACCCTcTATCTTTAGAAGCTCAATTAA 
AAT CAT T TAG AAA 

SEQ ID NO. 8604 

STRAIN H36B

G AAGG C T T C AC C TAT TAT G G AAAA 

AT T C CT G AAAAT C C G AAAAAAGT AAT T AAT T T T AC AT AT T C T T AC AC T GG 
ATATTTATTAAAACTAGGAGTTAATGTTTCAAGTTACAGTTTAGACTTAG 
AAAAAGATAgCCCCGTTTTTGGTAAgCAACTGAAAGGAGCTAAAAAATTA 
ACT G CT GAT GAT AC AG AAG C T AT T G CC G C AC AAAAAC CT GAT T T Aa T CAT 
GGT T T T T GAT C AAgAT C C AAAC AT C AAT AC T CT GAAAAAAAT T G C AC C AA 
CTTTAGTTATTAAATATGGTGCACAAAATTATTTAgATaTgATGCCAGCT 
TTGGGGAaAGTATTCGGTAAAGAAAAAGAAGCTAATCAGTGGGTTAGCCA 
ATGGAAAACTAAAACTCTCGCTGCCAAAAAAGATTTACACCATATCTTAA 
GGC CT a AC Ac TACT T TT ACT AT T AT AGA t T T T TAT GAT AAAAAT AT CT AT 
T TAT AT GGT AAT AAT T T T GG AC G CG G t G G Ag AAC T AAT C T AT GAT t C AC T 
AGGT T AT G CT G C C C C Ag AAAAAGT C AAAAAAg AT GT C T T T AAAAAAGG GT 
GGTTTACCGTTTCgCAAGAAGCAATCGGTgATTACGTTGGAGATTATGCC 
C T T GT T AAT AT AAAC AAAAC G AC T AAAAAAG C AG C T T C a T C AC T T AAAGA 
AAGT GAT GT T T G GAAGAAT T T AC C AGC T GT C AAAAAAGG GC AC AT CAT AG 
AAAGTAACTACGACGTGTTTTATTTCTCTGACCCTCTATCTTTAGAAGCT 
C AATT AAAAT CATTT AC AAA 

SEQ ID NO. 8605 

STRAIN 18RS21 
GAAGGCTTCACCTATTATGGA 

AAAATT CCT GAAAAT CCGAAAAAAGT AAT T AATTTT ACAT ATT CT TACAC 
TGGGTATTT ATT AAAACT AGGT GTT AAT GTTTCAAGTTACAGTTTAGACT 
T AGAAAAAGAT AGC CCCGTT TT T GGT AAAC AAC TG AAAG AAG CT AAAAAA 
TTAACTGCTGATGATACAGAAGCTATTGCCGCACAAAAACCTGATTTAAT 
CAT GGT T T T C GAT C AAG AT C C AAAC AT C AAT AC T C T GAAAAAAAT T G C AC 
CAACTTT AGT T ATT AAATATGGTGC AC AAAATT ATTTAg AT aTGATGCCA 
GC CT TGGGGAAAGT ATT CGGT AAAG AAAAAgAAGCTAAT CAGTGGGT TAG 
CCAATGGAAAACT AAAACT CTCGCTGTC AAAAAAG ATTT AC ACC AT ATCT 
T AAAG C C T AAC AC TACT T T T AC T AT TAT GGAT T T T TAT GAT AAAAAT AT C 
TATTTATATGGTAATAATTTTGGACGCGGTGGAGAACTAATCTATGATTC 
ACT AGGTT AT GCTGCCCCAg AAAAAGT C AAAAAAg ATGTCTTT AAAAAAG 
GGTGGTTTACCGTTTCGCAAGAAGCAATCGGTGATTACGTTGGAGATTAT 
GCC CTT GTT AAT ATAAACAAAACgACT AAAAAAGCAGCTTC AT C ACT T AA 
AG AAAGT G AT GT C T G GAAGAAT T T AC C AG C T GT C AAAAAAG G G C AC AT C A 
TAGAAAGTAACTACGACGTGTTTTATTTCTCTGACCCTCTATCTTTAGAA 
G C T C AAT T AAAAT CAT T T AC AAA 

SEQ ID NO. 8606 

STRAIN M732 

G AAGG C T T C AC C T AT TAT G G 

AAAAAT T CCT GAAAAT CCGAAAAAAGT AATT AATTT T AC AT ATT CTT ACA 
CTGGGT ATTT ATT AAAACT AGGTGTT AAT GTTTCAAGTTACAGTTTAGAC 
TT AGAAAAAGAT AGC CCCGTT TTTGGTAAGCAACTGAAAGAAGCTAAAAA 
AT T AAC T G CT GAT GAT AC AG AAG C T AT T G C C G C AC AAAAAC C T GAT T T AA 
TCATGGTTTTCGATCAAGATCCAAACATCAATACTCTGAAAAAAATTGCA 
CC AAC T T TAG T TAT T AAAT AT G GT G C AC AAAAT TAT T T Ag AT AT GAT GCC 
AGCCTTGGGGAAAGTATTCGGTAAAGAAAAAGAAGCTAATCAGtGGGTTA 
GCC AATGG AAAACT AAAACT CTCGCTGCCAAAAAAG ATTT AC AC CAT AT C 
T T AAAG C C T AAC AC TAG T T T T AC T AT T AT G GAT T T T TAT GAT AAAAAT AT 
CTATTTATATGGTAATAATTTTGGACgCGGtGGAgAACTAATCTATGATT 
CACTAGGTTATGCTGCCCCAGAAAAAGTCAAAAAAGATGTCTTTAAAAAA
GGGTGGTTTACCGTTTCGCAAGAAGCAATCGGTGATTACGTTGGAGATTA 
TGCCCT T GTT AATAT AAAC AAAACGACT AAAAAAGCAGCT T CAT CACTT A 

AAGAAAGTGATGTCTGGAAGAAtTTACCAGCTGTCAAAAAAGGGCACATC 
ATAGAAAGTAACTACGACGTGTTTTATTTCTCTGACCCTCTATCTTTAGA 
AGCT CAATTAAAAT CAT TTACAAA 

SEQ ID NO. 8607 

STRAIN COH1 

GAAGG CT T C AC CT AT T AT G 

GAAAAATTCCTGAAAATCCGAAAAAAGTAATTAATTTTACATATTCTTAC 
ACTGGGTATTTATTAAAACTAGGTGTTAATGTTTCAAGTTACAGTTTAgA 
CTTAGAAAAAGATAGCCCCGTTTTTGGTAAGCAACTGAAAGAAGCTAAAA 
AAT T AACT G CT GAT G AT AC AGAAG C TAT T G C CG C AC AAAAAC C T GAT T T A 
AT CAT G GT T T T C GAT C AAGAT C C AAAC AT C AAT ACT C T G AAAAAAAT T GC 
ACCAACTTTAGTTATTAAATATGGTGCACAAAATTATTTAgATATGATGC 
CAGCCTTGGGGAAAGTaTTcGGTAAAGAAAAAGAAGCTAATCAGTGGGTT 
AG C C AAT G G AAAAC T AAAACT C T C G C T GC C AAAAAAG AT T T AC AC CAT AT 
C T T AAAG C CT AAC AC T AC T T T TAG TAT T AT GG AT T T T T AT G AT AAAAAT A 
T CT AT T TAT AT GGT AAT AAT T T T G G AC G C GGT G G AG AAC T AAT CT AT GAT 
TCACTAGGTTATGCTGCCCCAGAAAAAGTCAAAAAAGATGTCTTTAAAAA 
AGGGTGGTTTACCGTTTCGCAAGAAGCAATCGGTGATTACGTTGGAGATT 
AT G C C C T T GT T AAT AT AAAC AAAAC GAC T AAAAAAG C AG C T T CAT C AC T T 
AAAG AAAGT GAT GT C T G G AAG AAT T T AC C AG CT GT C AAAAAAG G G C AC AT 
CATAGAAAGTAACTACGACGTGTTTTATTTCTCTGACCCTCTATCTTTAG 
AAGCT CAATTAAAAT CATTT AC AAA 

SEQ ID NO. 8608 

STRAIN M781 
GAAGGCTTCACCTATTATGG 

AAAAATTCCTGAAAATCCGAAAAAAGTAATTAATTTTACATATTCTTACA 
CTGGGTATTTATTAAAACTAGGTGTTAATGTTTCAAGTTACAGTTTAGAC 
TTAgAAAAAGATAGCCCCGTTTTTGGTAAGCAACTGAAAGAAGCTAAAAA
AT T AAC T G C T GAT GAT AC AG AAG C T AT T G C C G C AC AAAAAC C T GAT T T AA 
T CAT GGT T T T C GAT C AAG AT C C AAAC AT C AAT AC T C T GAAAAAAAT T G C A 
CC AACTTTAGT T ATT AAATATGGT GCACAAAATT AT TT AgATAT GAT G C C 
AGCCTTGGGGAAAGTATTCGGtAAAGAAAAAGAAGCTAATCAGTGGGTTA 
G C C AAT G G AAAAC T AAAAC T CTCGCTGC C AAAAAAG AT T T AC AC CAT AT C 
TTAAAGCCTAACACTACTTTTACTATTATGGATTTTT AT GAT AAAAAT AT 
CTATTTATATGGTAATAATTTTGGACGCGGTGGAGAACTAATCTATGATT 
CACTAGGTTATGCTGCCCCAGAAAAAGTCAAAAAAGATGTCTTTAAAAAA 
GGGT GGT T T AC C GT T T C G C AAGAAG C AAT C G GT G AT T AC GT T G GAG AT T A 
T G C C CT T GT T AAT AT AAAC AAAAC G AC T AAAAAAG C AGC T T CAT C AC T T A 
AAG AAAGT GAT GT C T G G AAG AAT T T ACC AG C T G T C AAAAAAG GG C AC AT C 
ATAGAAAGTAACTACGACGTGTTTTATTTCTCTGACCCTCTATCTTTAGA 
AGCT CAATTAAAAT CAT TTACAAA 

SEQ ID NO. 8609 

STRAIN CJB110 
GAAGGCTTCACCTATTATGGA 

AAAATTCCTGAAAATCCGAAAAAAGTAATTAATTTTACATATTCTTACAC 
TGGGTATTTATTAAAACTAGGTGTTAATGTTTCAAGTTACAGTTTAGACT 
TAG AAAAAG AT AG CCCCGTTTTT G GT AAGC AAC T G AAAG AAG CTAAAAAA 
T T AACT G C T GAT GAT AC AG AAG CT AT T G C C GC AC AAAAAC C T GAT T T AAT 
CAT G GT T T T C GAT C AAG AT C C AAAC AT C AAT ACT CT GAAAAAAAT T GC AC 
C AACT T T AGT T AT T AAAT AT GGT G C AC AAAAT T AT T T Ag AT AT GAT G C C A 
GCCTTGGGGAAAGTATTCGGTAAAGAAAAAGAAGCTAATCAGTGGGTTAG 
CCAATGG AAAACT AAAACT CTCGCTGCC AAAAAAG AT TT AC AC CAT AT CT 
TAAAGCCTAACACTACTTTTACTATTATGGATTTTTATGATAAAAATATC 
T AT T TAT AT GGT AAT AAT T T T GG AC G C GG t G GAG AAC T AAT C T AT GAT T C 
ACTAGGTTATGCTGCCCCAGAAAAAGTCAAAAAAGATGTCTTTAAAAAAG 
GGTGGTTTACCGTTTCGCAAGAAGCAATCGGTGATTACGTTGGAGATTAT 
GCCCTTGTTAATATAAACAAAACGACTAAAAAAGCAGCTTCATCACTTAA 
AGAAAG T G AT GT C T GG AAG AAT T T AC C AG C T GT C AAAAAAG G GC AC AT C A 
TAGAAAGTAACTACGACGTGTTTTATTTCTCTGACCCTCTATCTTTAGAA
GCTCAATTAAAATCATTTACAAA 

SEQ ID NO. 8610 

STRAIN 1169NT 

G AAGG C T T C AC C T AT T AT GGAAAAAT T 

CCTGAAAAT CCG AAAAAAGT AATT AATTT T AC AT ATT CTT ACACTGGGTA 
T T T AT T AAAAC T AGGT GT T AAT GT T T C AAGT T AC AGT T TAG ACT TAG AAA 
AAGATAGCC CCGT T TTTGGT AAGC AACT GAAAGAAGCTAAAAAATTAACT 
G C T GAT G AT AC AG AAG CT AT T GC C g c AC AAa a ACCT GAT T T AAT CAT G GT 
TTTCGATC AAGAT C CAAACAT CAATACT CT GAAAAAAATT GC ACC AACT T 
T AGT T AT T AAAT AT G GT G C AC AAAAT TAT T T Ag AT AT GAT G C C AGC C T T G 
GGG AAAGT AT T CGG T AAAG AAAAAGa a G CT AAT C AGT G G G T TAG C C AAT G 
GAAAACTAAAACTCTCGCTGCCAAAAAAGATTTACACCATATCTTAAAGC 
CTAACACTACTTTTACTATTATGGATTTTTATGATAAAAATATCTATTTA 
TAT GGT AAT AAT T T T G G AC G C GGT GG AG AAC T AAT C TAT GAT T C AC T AGG 
TTATGCTGCCCCAgAAAAAGTCAAAAAAGATGTCTTTAAAAAAGGGTGGT 
TTACCGTTTCgCAAGAAGCAATCGGTGATTACGTTGGAGATTATGCCCTT 
GTTAATATAAACAAAACGACTAAAAAAGCAGCTTCATCACTTAAAGAAAG 
T GAT GT CT GG AAG AAT T T AC C AG C T GT C AAAAAAG G G C AC AT CAT AG AAA 
GTAACTACGACGTGTTTTATTTCTCTGACCCTCTATCTTTAGAAGCTCAA 
T T AAAAT CAT T T AC AAA 

SEQ ID NO. 8611 

STRAIN JM9130013 

G AAG G CT T C AC C T ATT AT G 

GAAAAAT T C CT GAAAAT C CGAAAAAAGT AAT T AATTTT ACAT AT T CTT AC 
AC T GG AT AT T T AT T AAAAC TAG G AGT T AAT GT T T C AAG T T AC AG T T T AGA 
CTTAGAAAAAGATAGCCCCGTTTTTGGTAAGCAACTGAAAGGAGCTAAAA 
AAT T AACT GCTGAT GAT ACAGAAGCTAT TGCCGCACAAAAAC CT G ATT T A 
AT CATGGT T T TTGAT CAAGATC CAAACAT CAATACT CT GAAAAAAATT G C 
AC C AACT T T AGTT AT T AAAT ATGGTGC AC AAAATTAT T T AgAT AT GAT G C 
CAGCTTTGGGGAAAGTATTCGGTAAAGAAAAAGAAGCTAATCAGTGGGTT 
AGCCAATGGAAAACT AAAACT CT CGCT GC CAAAAAAGAT T T ACAC CAT AT 
CTTAAAACCTAACACTACTTTTACCATTATGGATTTTTATGATAAAAATA 
TCTATTTATATGGTAATAATTTTGGACGCGGtGGAGAACTAATCTATGAT 
T C AC T AG GT T AT G CT G C C C C Ag AAAAAGT CAAAAAAGAT G T C T T T AAAAA 
AGGGT GGT T T AC C GT T T C g C AAGAAG C AAT C GGT G AT T AC GT T G GAG AT T 
AT GC CCT TGTT AAT AT AAACAAAACGACT AAAAAAGC AGCTT C AT CACT T 
AAAG AAAGT GAT GT C T GG AAG AAT T T AC C AG C T GT C AAAAAAG G G C AC AT 
CATAGAAAGTAACTACGACGTGTTTTATTTCTCTGACCCTCTATCTTTAG 
AAG C T C AAT T AAAAT CAT T T AC AAA 

SEQ ID NO. 8612 
STRAIN 2603 frame: 1

MKKIGIIVLTLLTFFLVSCGQQTKQESTKTTISKMPKIEGFTYYGKIPENPKKVINFTYS 
YTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKEAKKLTADDTEAIAAQKPDLIMVFDQDPN 
INTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEANQWVSQWKTKTLAVKKDLHHILK 
PNTTFTIMDFYDKNIYLYGNNFGRGGELIYDSLGYAAPEKVKKDVFKKGWFTVSQEAIGD 
YVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHIIESNYDVFYFSDPLSLEAQLKS 
FTKAIKENTN 

SEQ ID NO. 8613 

STRAIN 090 frame: 1

EGFTYYGKIPENPKKVINFTYSYTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKEAKKLTA 
DDTEAIAAQKPDLIMVFDQDPNINTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEAN 
QWVSQWKTKTLAAKKDLHHILKPNTTFTIMDFYDKNIYLYGNNFGRGGELIYDSLGYAAP 
EKVKKDVFKKGWFTVSQEAIGDYVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHI 
IESNYDVFYFSDPLSLEAQLKSFT 

SEQ ID NO. 8614 

STRAIN A909 frame: 1 

EGFTYYGKIPENPKKVINFTYSYTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKGAKKLTA 
DDTEAIAAQKPDLIMVFDQDPNINTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEAN 
QWVSQWKTKTLAAKKDLHHILKPNTTFTIMDFYDKNIYLYGNNFGRGGELIYDSLGYAAP
EKVKKDVFKKGWFTVSQEAIGDYVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHI 
IESNYDVFYFSDPLSLEAQLKSFT 

SEQ ID NO. 8615 

STRAIN H36B frame: 1

EGFTYYGKIPENPKKVINFTYSYTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKGAKKLTA
DDTEAIAAQKPDLIMVFDQDPNINTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEAN 
QWVSQWKTKTLAAKKDLHHILRPNTTFTIIDFYDKNIYLYGNNFGRGGELIYDSLGYAAP 
EKVKKDVFKKGWFTVSQEAIGDYVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHI 
IESNYDVFYFSDPLSLEAQLKSFT 

SEQ ID NO. 8616 

STRAIN 18RS21 frame: 1 

EGFTYYGKIPENPKKVINFTYSYTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKEAKKLTA
DDTEAIAAQKPDLIMVFDQDPNINTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEAN 
QWVSQWKTKTLAVKKDLHHILKPNTTFTIMDFYDKNIYLYGNNFGRGGELIYDSLGYAAP 
EKVKKDVFKKGWFTVSQEAIGDYVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHI 
IESNYDVFYFSDPLSLEAQLKSFT

SEQ ID NO. 8617 

STRAIN M732 frame: 1

EGFTYYGKIPENPKKVINFTYSYTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKEAKKLTA
DDTEAIAAQKPDLIMVFDQDPNINTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEAN 
QWVSQWKTKTLAAKKDLHHILKPNTTFTIMDFYDKNIYLYGNNFGRGGELIYDSLGYAAP 
EKVKKDVFKKGWFTVSQEAIGDYVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHI 
IESNYDVFYFSDPLSLEAQLKSFT 

SEQ ID NO. 8618 

STRAIN COH1 frame: 1 

EGFTYYGKIPENPKKVINFTYSYTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKEAKKLTA
DDTEAIAAQKPDLIMVFDQDPNINTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEAN 
QWVSQWKTKTLAAKKDLHHILKPNTTFTIMDFYDKNIYLYGNNFGRGGELIYDSLGYAAP 
EKVKKDVFKKGWFTVSQEAIGDYVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHI 
IESNYDVFYFSDPLSLEAQLKSFT 

SEQ ID NO. 8619 

STRAIN M781 frame: 1

EGFTYYGKIPENPKKVINFTYSYTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKEAKKLTA
DDTEAIAAQKPDLIMVFDQDPNINTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEAN 
QWVSQWKTKTLAAKKDLHHILKPNTTFTIMDFYDKNIYLYGNNFGRGGELIYDSLGYAAP 
EKVKKDVFKKGWFTVSQEAIGDYVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHI 
IESNYDVFYFSDPLSLEAQLKSFT 

SEQ ID NO. 8620 

STRAIN CJB110 frame: 1 

EGFTYYGKIPENPKKVINFTYSYTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKEAKKLTA
DDTEAIAAQKPDLIMVFDQDPNINTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEAN 
QWVSQWKTKTLAAKKDLHHILKPNTTFTIMDFYDKNIYLYGNNFGRGGELIYDSLGYAAP 
EKVKKDVFKKGWFTVSQEAIGDYVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHI 
IESNYDVFYFSDPLSLEAQLKSFT 

SEQ ID NO. 8621 

STRAIN 1169NT frame: 1 

EGFTYYGKIPENPKKVINFTYSYTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKEAKKLTA
DDTEAIAAQKPDLIMVFDQDPNINTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEAN 
QWVSQWKTKTLAAKKDLHHILKPNTTFTIMDFYDKNIYLYGNNFGRGGELIYDSLGYAAP 
EKVKKDVFKKGWFTVSQEAIGDYVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHI 
IESNYDVFYFSDPLSLEAQLKSFT 

SEQ ID NO. 8622 

STRAIN JM9130013 frame: 1 

EGFTYYGKIPENPKKVINFTYSYTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKGAKKLTA
DDTEAIAAQKPDLIMVFDQDPNINTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEAN 
QWVSQWKTKTLAAKKDLHHILKPNTTFTIMDFYDKNIYLYGNNFGRGGELIYDSLGYAAP
EKVKKDVFKKGWFTVSQEAIGDYVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHI
IESNYDVFYFSDPLSLEAQLKSFT

SEQ ID NO. 8701 
STRAIN 2603 

ATGAAATTATCGAAGAAGTTATTGTTTTCGGCTGCTGTT 

TTAACAATGGTGGCGGGGTCAACTGTTGAACCAGTAGCTCAGTTTGCGACTGGAATGAGT
ATTGTAAGAGCTGCAGAAGTGTCACAAGAACGCCCAGCGAAAACAACAGTAAATATCTAT 
AAATTACAAGCTGATAGTTATAAATCGGAAATTACTTCTAATGGTGGTATCGAGAATAAA 
GACGGCGAAGTAATATCTAACTATGCTAAACTTGGTGACAATGTAAAAGGTTTGCAAGGT 
GT AC AG T T T AAACGT T AT AAAGT C AAG AC G GAT AT TT C T GT TG AT G AAT T G AAAAAAT T G 
ACAACAGTTGAAGCAGCAGATGCAAAAGTTGGAACGATTCTTGAAGAAGGTGTCAGTCTA 
CCTCAAAAAACTAATGCTCAAGGTTTGGTCGTCGATGCTCTGGATTCAAAAAGTAATGTG 
AGATACTTGTATGTAGAAGATTTAAAGAATTCACCTTCAAACATTACCAAAGCTTATGCT 
GTACCGTTTGTGTTGGAATTACCAGTTGCTAACTCTACAGGTACAGGTTTCCTTTCTGAA 
ATTAATATTTACCCTAAAAACGTTGTAACTGATGAACCAAAAACAGATAAAGATGTTAAA 
AAAT T AG GT C AG G AC GAT G C AGGT TAT AC GAT T GG T GAAG AAT T C AAAT G G T T C T T GAAA 
TCTACAATCCCTGCCAATTTAGGTGACTATGAAAAATTTGAAATTACTGATAAATTTGCA 
GAT G G CT T G AC T TAT AAAT CT GT T GGAAAAAT CAAGAT T G GT T CG AAAAC AC T G AAT AG A 
GATGAGCACTACACTATTGATGAACCAACAGTTGATAACCAAAATACATTAAAAATTACG 
T T T AAAC C AG AG AAAT T T AAAG AAAT T G C T G AG CT AC T T AAAG G AAT G AC C C T T GT T AAA 
AAT CAAGAT G CT CT T G AT AAAG CT ACT G C AAAT AC AG AT GAT G CGG C AT T T T T GGAAAT T 
CC AGT T G CAT C AAC T AT T AAT G AAAAAG C AGT T T T AGG AAAAG C AAT T GAAAAT AC T T T T 
GAACTT CAAT AT GAC C AT ACT CCTGAT AAAGCTGAC AAT CC AAAACC AT CT AAT CCT CCA 
AG AAAAC C AG AAG T T CAT ACT GGT GGG AAAC GAT T T GT AAAG AAAG AC T C AAC AGAAAC A 
CAAACACTAGGTGGTGCTGAGTTTGATTTGTTGGCTTCTGATGGGACAGCAGTAAAATGG 
ACAGATGCTCTTATTAAAGCGAATACTAATAAAAACTATATTGCTGGAGAAGCTGTTACT 
G GG C AAC CAAT C AAAT T G AAAT C AC AT AC AGACGG T AC GT T T G AG AT T AAAGGT T T GG CT 
TAT G C AGT T GAT G C G AAT G C AG AG GG T AC AG C AGT AAC T T AC AAAT T AAAAG AAAC AAAA 
G C AC C AG AAGGT T AT GT AAT C C C T GAT AAAG AAAT CG AGT T T AC AGT AT C AC AAAC AT C T 
TAT AAT AC AAAAC C AACT GAC AT C AC GGT T GAT AGT G C T GAT G C AAC AC C T GAT AC AAT T 
AAAAACAACAAACGTCCTTCAATCCCTAATACTGGTGGTATTGGTACGGCTATCTTTGTC 
GCTATCGGTGCTGCGGTGATGGCTTTTGCTGTTAAGGGGATGAAGCGTCGTACAAAAGAT 
AAC 

SEQ ID NO. 8702 

STRAIN 090 

GCAGAAGTGTC AC AAG AACGCCCAGCG AAAAC 

AGCAGTAAATATCTATAAATTACAAGCTGATAGTTATAAATCGGAAATTA 
CTTCTAATGGTGGTATCGAGAATAAAGACGGCGAAGTAATATCTAACTAT
GCTAAACTTGGTGACAATGTAAAAGGTTTGCAAGGTGTACAGTTTAAACG 
T TAT AAAGT C AAG AC GGAT AT T T CT G T T GAT G AAT T G AAAAAAT T G AC AA 
CAGTTGAAGCAGCAGATGCAAAAGTTGGAACGATTCTTGAAGAAGGTGTC 
AGTCTACCTCAAAAAACTAATGCTCAAGGTTTGGTCGTCGATGCTCTGGA 
T T C AAAAAGT AAT G T GAG AT ACT T GT AT G TAG AAG AT T T AAAGAAT T C AC 
CTTCAAACATTACCAAAGCTTATGCTGTACCGTTTGTGTTGGAATTACCA 
GTTGCTAACTCTACAGGTACAGGTTTCCTTTcTGAAATTAATATTTACCC 
T AAAAAC G T T GT AAC T GAT G AAC C AAAAAC AG AT AAAG AT GT T AAAAAAT 
TAG G T C AGG AC GAT G C AG GT TAT AC GAT T G G T GAAG AAT T C AAAT G GT T C 
TTGAAATCTACAATCCCTGCCAATTTAGGTGACTATGAAAAATTTGAAAT 
T AC T GAT AAAT T T G C AGAT GG C T T GAC T TAT AAAT CT GT T G G AAAAAT C A 
AG AT T GGT T C GAAAAC ACT G AAT AG AG AT GAG C AC T AC AC T AT T GAT G AA 
C C AAC AGTT G AT AAC C AAAAT AC AT T AAAAAT T ACGT T T AAAC C AGAG AA 
AT T T AAAG AAAT T G C T GAG C T AC T T AAAG G AAT G ACC CTT GT T AAAAAT C 
AAGATGCTCTTGATAAAGCTACTGCAAATACAGATGATGCGGCATTTTTG 
G AAAT T C C AGT T G CAT C AAC TAT T AAT G AAAAAG C AGT T T T AG G AAAAGC 
AAT T GAAAAT AC T T T T G AAC T T CAAT AT GAC CAT ACT CCT GAT AAAG C T G 
ACAATCCAAAACCATCTAATCCTCCAAGAAAACCAGAAGTTCATACTGGT 
G G G AAAC GAT T T G T AAAG AAAG ACT C AAC AG AAAC AC AAAC AC TAG GT G G 
TGCTGAGTTTGATTTGTTGGCTTCTGATGGGACAGCAGTAAAATGGACAG 
AT G CT C T TAT T AAAG C G AAT ACT AAT AAAAAC TAT AT T G C T G GAG AAG C T 
GTTACTGGGCAACCAATCAAATTGAAATCACATACAGACGGTACGTTTGA 
GATTAAAGGTTTGGCTTATGCAGTTGATGCGAATGCAGAGGGTACAGCAG 
TAACTTACAAATTAAAAGAAACAAAAGCACCAGAAGGTTATGTAATCCCT
GAT AAAG AAATCGAGTTT ACAGT AT CAC AAACAT CT TATAAT ACAAAACC 
AACTGACAT CACGGTTGATAGTGCT GATGCAAC AC CT GAT AC AATTAAAA 
AC AAC AAACGT CCT T C A 

SEQ ID NO. 8703 

STRAIN A909 

GCAGAAGTGTCACAAGAACGCCCAGCGAA 

AAC AAC AG T AAAT AT C T AT AAAT T AC AAG C T GAT AGT T AT AAAT C GG AAA 
TTACTTCTAATGGTGGTATCGAGAATAAAGACGGCGAAGTAATATCTAAC 
TATGCTAAACTTGGTGACAATGTAAAAGGTTTGCAAGGTGTACAGTTTAA 
ACG T T AT AAAGT C AAG AC G GAT AT T T CT GT T GAT G AAT T G AAAAAAT T GA 
C AAC AGT T GAAG C AGC AG AT GC AAAAGT T GG AACG AT T CT T G AAG AAG GT 
GTCAGTCTACCTCAAAAAACTAATGCTCAAGGTTTGGTCGTCGATGCTCT 
GGAT T CAAAAAGT AAT GT GAGAT ACTTGT ATGT AGAAGAT TT AAAGAATT 
CACCTTCAAACATTACCAAAGCTTATGCTGTACCGTTTGTGTTGGAATTA 
CCAGTTGCTAACTCTACAGGTACAGGTTTCCTTTCTGAAATTAATATTTA 
CCCTAaaAACGTTGTAACTGATGAACCAAAAACAGATAAAGATGTTAAAA 
AAT T AGG T C AG G AC GAT G C AGGT TAT ACG AT T GGT GAAG AAT T C AAAT GG 
TTCTTGAAATCTACAATCCCTGCCAATTTAGGTGACTATGAAAAATTTGA 
AAT T AC T GAT AAAT T T G C AGAT G G C T T G AC T TAT AAAT C T GT T GGAAAAA 
T C AAG AT T GGT T CG AAAAC AC T GAAT AG AGAT GAG C ACT AC AC TAT T GAT 
GAACCAACAGTTGATAACCAAAATACATTAAAAATTACGTTTAAACCAGA 
GAAATTT AAAGAAATTGCT GAGCT ACTT AAAGGAATGACC CT T GT TAAAA 
AT C AAG AT G C T C T T GAT AAAG C T AC T G C AAAT AC AG AT GAT G C GG CAT T T 
T T GG AAAT T C C AGT T GC AT C AAC TAT T AAT GAAAAAG C AGT T T T AGG AAA 
AGC AAT T G AAAAT AC T T T T G AACT T C AAT AT G AC CAT AC t C C T GAT AAAG 
C T G AC AAT C C AAAAC CAT CT AAT C CT C CAAG AAAAC C AG AAG T T CAT ACT 
G GT G GG AAAC GAT T T GT AAAGAAAG AC T C AAC AG AAAC AC AAAC ACT AGG 
TGGTGCTGAGTTTGATTTGTTGGCTTCTGATGGGACAGCAGTAAAATGGA 
CAGATGCTCTTATTAAAGCGAATACTAATAAAAACTATATTGCTGGAGAA 
GCT GT TACT GGGC AAC CAAT CAAATTGAAAT CAC AT ACAGAC GGT ACGTT 
T G AG AT T AAAGGT T T G G C T T AT G C AGT T GAT G C G AAT GC AG AGGGT AC AG 
C AG T AACT T AC AAAT T AAAAG AAAC AAAAG CAC C AGAAG GT T AT GT AAT C 
C CT GAT AAAG AAAT C GAGT T T AC AGT AT CAC AAAC AT C T TAT AAT AC AAA 
AC C AACT G AC AT CAC G GT T GAT AGT GCT GAT G C AAC AC C T GAT AC AAT T A 
AAAACAACAA 

SEQ ID NO. 8704 

STRAIN 18RS21 

GCAGAAGT GT CAC AAG AACG C C C AG C G AAAAC 

AGCAGTAAATATCTATAAATTACAAGCTGATAGTTATAAATCGGAAATTA 
CTTCTAATGGTGGTATCGAGAATAAAGACGGCGAAGTAATATCTAACTAT 
G C T AAAC T T GG T G AC AAT GT AAAAGGT T T G C AAG GT GT AC AGT T T AAACG 
T TAT AAAGT CAAG AC G GAT AT T T CT GT T GAT GAAT T G AAAAAAT T G AC AA 
C AGT TGAAG C AG C AG AT G C AAAAGT T G G AAC GAT T CT T GAAG AAGGT GT C 
AGTCTACCTCAAAAAACTAATGCTCAAGGTTTGGTCGTCGATGCTCTGGA 
TT CAAAAAGT AAT GT GAGAT ACT T GT AT GT AGAAGAT T TAAAGAATT CAC 
CTTCAAACATTACCAAAGCTTATGCTGTACCGTTTGTGTTGGAATTACCA 
GTTGCTAACTCTACAGGTACAGGTTTCCTTTCTGAAATTAATATTTACCC 
T AAAAAC GT T GT AAC T GAT GAAC C AAAAAC AG AT AAAG AT GT T AAAT AAT 
TAGGTCAGGACGATGCAGGTTATACGATTGGTGAAGAATTCAAATGGTTC 
T T G AAAT CT AC AAT C C CT G C CAAT T T AGG T G ACT AT G AAAAAT T T G AAAT 
TACTGATAAATTTGCAGATGGCTTGACTTATAAATCTGTTGGAAAAATCA 
AGAT T GGT T C G AAAAC AC T GAAT AG AG AT GAG C AC T AC AC T ATT GAT G AA 
C C AAC AGT T GAT AAC C AAAAT AC ATT AAAAAT T AC GTTT AAAC CAg AG AA 
AT T T AAAG AAAT T G C T GAG C T ACT T AAAGG AAT G AC C CT T G T T AAAAAT C 
AAG AT G CT C T T GAT AAAG C T AC T GC AAAT AC AG AT GAT G C GGC AT T T T T G 
G AAAT T C C AGT T G CAT C AAC TAT T AAT G AAAAAGC AGT T T T AG GAAAAG C 
AAT T G AAAAT ACT T T T G AAC T T CAAT AT G AC CAT AC T C C T G At AAAG C t G 
AC AAT C C AAAAC CAT C T AAT CCT C CAAG AAAAC CAG AAGT T C AT ACT G GT 
GG G AAAC GAT T T GT AAAGAAAG ACT C AAC AG AAAC AC AAAC ACT AGGT G G 
TGCTGAGTTTGATTTGTTGGCTTCTGATGGGACAGCAGTAAAATGGACAG 
ATGCTCTTATTAAAGCGAATACTAATAAAAACTATATTGCTGGAGAAGCT 
GTTACTGGGCAACCAATCAAATTGAAATCACATACAGACGGTACGTTTGA
GATTAAAGGTTTGGCTTATGCAGTTGATGCGAATGCAGAGGGTACAGCAG 
T AACT T ACAAAT T AAAAGAAAC AAAAG C AC C AGAAGGT T AT GT AAT C C CT 
G AT AAAGAAAT CG AGT T T AC AGT AT C AC AAAC AT C T T AT AAT AC AAAACC 
AACT G AC AT C AC GGT T GAT AGT GCT GAT G C AAC AC C T GAT AC AAT T AAAA 
ACAACAAACGTCCTTCA 

SEQ ID NO. 8705 

STRAIN M732 

GCAGAAGTGT CACAAGAACGCCCAGCGAAAACAACAGT 
AAATATCTATAAATTACAAGCTGATAGTTATAAATCGGAAATTACTTCTA 
AT GGT G GT AT C G AG AAT AAAGAC G G C G AAG T AAT AT C T AAC TAT G C T AAA 

CTTGGTGACAATGTAAAAGGTTTGCAAGGTGTACAGTTTAAACGTTATAA 
AGT C AAG ACG G AT AT T T C T GT T GAT G AAT T G AAAAAAT T G AC AAC AG t T G 
AAG C AG C AGAT G C AAAAG T T GG AACG AT T C T T G AAG AAG GT GT C AG T CT A 
CCTCAAAAAACTAATGCTCAAGGTTTGGTCGTCGATGCTCTGGATTCAAA 
AAG T AAT G T G AGAT AC T T GT AT GT AG AAG AT T T AAAG AAT T C AC C T T C AA 
AC AT T AC C AAAGC T TAT GCT GT AC C G T T T G T GT T GG AAT TAG C AGT T G CT 
AACTCTACAGGTACAGGTTTCCTTTCTGaAATTAATATTTACCCTAAAAA 
CGT T GT AAC T GAT G AAC CAAAAAC AGAT AAAGAT GT T AAAAAAT TAGGT C 
AGG AC GAT G C AG GT TAT AC GAT T G GT G AAG AAT T C AAAT GG T T CT T G AAA 
TCTACAATCCCTGCCAATTTAGGTGACTATGAAAAATTTGAAATTACTGA 
T AAAT T T G C AG AT G G CT T G ACT TAT AAAT C T GT T G GAAAAAT C AAGAT T G 
G T T C G AAAAC ACT G AAT AGAG AT GAG C AC T AC AC TAT T GAT G AAC C AAC A 
GTTGATAACCAAAATACATTAAAAATTACGTTTAAACCAGAGAAATTTAA 
AG AAAT T G C T GAG C T AC T T AAAG G AAT G AC C C T T G T T AAAAAT C AAG AT G 
CTCTTGATAAAGCTACTGCAAATACAGATGATGCGGCATTTTTGGAAATT 
CCAGTTGCAT CAACT ATT AAT GAAAAAGCAGT T TT AGGAAAAGC AAT T GA 
AAAT ACT T T T G AAC T T C AAT AT G AC CAT ACT C C T GAT AAAG C T G AC AAT C 
C AAAAC CAT C T AAT C C T C C AAG AAAAC C AGAAGT T CAT ACT GGT G GG AAA 
C GAT T T GT AAAG AAAG AC T C AAC AG AAAC AC AAAC ACT AGG T G GT GCT G A 
GTTTGATTTGTTGGCTTCTGATGGGACAGCAGTAAAATGGACAGATGCTC 
T TAT T AAAG CG AAT AC T AAT AAAAAC T AT AT T G C T G GAG AAG C T G T T AC T 
G GGC AAC C AAT C AAAT T GAAAT C AC AT AC AGAC G GT AC GT T T GAG AT T AA 
AG GTTTGGCT TAT G C AGT T G AT G C G AAT G C AG AG G GT AC AG C AGT AACT T 
ACAAAT T AAAAG AAAC AAAAG C AC C AG AAGGT T AT GT AAT C C C T GAT AAA 
GAAAT C G AGT T T AC AGT AT C AC AAAC AT CT TAT AAT AC AAAAC CAACT G A 

CATCACGGTTGATAGTGCTGATGCAACACCTGATACAATTAAAAACAACA 
AACGTCCTTCA 



SEQ ID NO. 8706 

STRAIN COH1 

G C AGAAGT GT C AC AAGAAC G C C C AG CGAAAAC 

AGC AGT AAAT AT C TAT AAAT T AC AAG C T GAT AG T TAT AAAT C G GAAAT T A 
C T T n T AAT GGT G GT AT C GAGAAT AAAG ACG G C G AAG T AAT AT C T AAC TAT 

G CT AAACT T G GT GAC AAT GT AAAAG GTTTGCAAGGTGT AC AGT TT AAAC G 
T T AT AAAGT C AAG AC GGAT AT T T CT GTT GAT G AAT T GAAAAAAT T GAC AA 
C AGT T G AAG C AG C AGAT G C AAAAGT T G G AAC GAT T C T T G AAG AAGGT GT C 

AGTCTACCTCAAAAAACTAATGCTCAAGGTTTGGTCGTCGATGCTCTGGA 
T T C AAAAAGT AAT GT GAG AT AC T T GT AT G TAG AAG AT T T AAAG AAT T C AC 

CTTCAAACATTACCAAAGCTTATGCTGTACCGTTTGTGTTGGAATTACCA 
G T T G C T AAC T C T AC AGGT AC AGGT TTCCTTTCT GAAAT T AAT AT T T AC C C 
T AAAAACGT T GT AACT GAT G AAC CAAAAAC AGAT AAAGAT GT T AAAAAAT 
T AG GT C AGG AC GAT G C AG GT TAT AC GAT T G GT G AAG AAT T C AAAT G GT T C 

TTGAAATCTACAATCCCTGCCAATTTAGGTGACTATGAAAAATTTGAAAT 
TACT GAT AAAT T T GC AG AT G G CT T G AC T TAT AAAT C T GT T G GAAAAAT C A 
AGAT T G G T T C GAAAAC ACT G AAT AG AG AT GAG C AC T AC AC T AT T GAT G AA 
CC AAC AGT T GAT AAC C AAAAT AC AT T AAAAAT T AC GTT T AAAC C AG AG AA 
ATTTAAAGAAATTGCTGAGCTACTTAAAGGAATGACCCTTGTTAAAAATC 
AAGAT GCT CT T G AT AAAGCT AC T G C AAAT AC AG AT GAT G C G G C AT T T T T G 
GAAAT T C C AGT T G CAT CAACT AT T AAT G AAAAAG C AG T T T T AGG AAAAG C 
AAT T G AAAAT ACT T T T G AA C T T C AAT AT GAC CAT AC T C C T GAT AAAG C T G 
AC AAT C C AAAAC CAT CT AAT C CT C C AAG AAAAC C AG AAGT T CAT AC T G GT 
GGGAAACGATTTGTAAAGAAAGACTCAACAGAAACACAAACACTAGGTGG 
TGCTGAGTTTGATTTGTTGGCTTCTGATGGGACAGCAGTAAAATGGACAG
AT G CT C T T AT T AAAG C G AAT AC T AAT AAAAAC TAT AT T GC T G G AG AAG C T 
G T T AC T G G G C AAC C AAT C AAAT T G AAAT C AC AT AC AG AC G GT AC GT T T G A 
GATTAAAGGTTTGGCTTATGCAGTTGATGCGAATGCAGAGGGTACAGCAG 
T AAC T T AC AAAT T AAAAG AAAC AAAAG C AC C AGAAG G TT AT G T AAT C C CT 
GAT AAAG AAAT C GAGT T T AC AGT AT CAC AAACAT CTT AT AAT ACAAAACC 
AAC T G AC AT C AC G GT T GAT AGT G C T GAT GC AAC AC CT G AT AC AAT T AAAA 
ACAACAAACGTCCTTCA 

SEQ ID NO. 8707 

STRAIN M781

G C AGAAGT G T CAC AAG AAC G C C C AG CGAAAAC AG 

C AG T AAAT AT C TAT AAAT T AC AAG C T GAT AGT TAT AAAT C G G AAAT T AC T 
T CT AAT G GT GG TAT CG AG AAT AAAG AC GG C G AAG T AAT AT CT AACT AT G C 
TAAACTTGGTGACAATGTAAAAGGTTTGCAAGGTGTACAGTTTAAACGTT 
AT AAAGT C AAG AC G GAT AT T T CT GT TGAT G AAT T G AAAAAAT T G AC AAC A 
GT T GAAG C AG C AGAT G C AAAAGT T G G AAC GAT T CT T G AAGAAGG T GT C AG 
TCTACCTCAAAAAACTAATGCTCAAGGTTTGGTCGTCGATGCTCTGGATT 
C AAAAAGT AAT G T GAG AT AC T T G TAT GT AGAAGATT TAAAGAATT C AC CT 
TCAAACATTACCAAAGCTTATGCTGTACCGTTTGTGTTGGAATTACCAGT 
TGCTAACTCTACAGGTACAGGTTTCCTTTCTGaAATTAATATTTACCCTA 
AAAACGT T GT AAC T GAT G AAC C AAAAAC AG AT AAAG AT G T T AAAAAAT T A 
GGT C AGG AC GAT G C AG G T T AT ACG AT T GGT GAAG AAT T C AAAT G GT T C T T 
G AAAT CT AC AAT C C CT G C C AAT T T AG GT G AC TAT G AAAAAT T T GAAAT T A 
CTGATAAATTTGCAGATGGCTTGACTTATAAATCTGTTGGAAAAATCAAG 
AT T G GT T C G AAAAC ACT G AAT AG AG AT GAG CAC T AC AC TAT T GAT G AAC C 
AAC AGT T GAT AAC C AAAAT AC AT T AAAAAT T AC GT T T AAAC C AG AG AAAT 
TTAAAGAAATTGCTGAGCTACTTAAAGGAATGACCCTTGTTAAAAATCAA 
GATGCT CTT GAT AAAGCT ACT GC AAAT ACAGATGATGCGGCATTTTTGGA 
AAT T C C AGT T G C AT C AACT AT T AAT G AAAAAG C AGT T T T AGG AAAAG C AA 
T T G AAAAT AC T T T T GAACT T C AAT AT G AC CAT ACT C CT GAT AAAG C T GAC 
AATCCAAAACCATCTAATCCTCCAAGAAAACCAGAAGTTCATACTGGTGG 
GAAACGATTT GTAAAGAAAGACT CAACAGAAACAC AAAC ACTAGGT GGT G 
CTG AGT T T GAT TTGTTGGCTTCT GAT G GG AC AG C AGT AAAAT GG AC AG AT 
GCTCTTATTAAAGCGAATACTAATAAAAACTATATTGCTGGAGAAGCTGT 
TAG T GG G C AAC C AAT C AAAT T GAAAT C AC AT AC AG ACGGT AC GT T T GAGA 
TTAAAGGTTTGGCTTATGCAGTTGATGCGAATGCAGAGGGTACAGCAGTA 
AC T T AC AAAT TAAAAGAAAC AAAAG CAC C AG AAGG T T AT GT AAT C C C T G A 
T AAAG AAAT C GAGT T T AC AGT AT CAC AAAC AT C T TAT AAT AC AAAAC C AA 
C T GAC AT C AC G GT T GAT AGT G C T GAT G C AAC AC C T GAT AC AAT T AAAAAC 
AAC AAAC GT 

SEQ ID NO. 8708 

STRAIN CJB110 

G C AGAAG T GT C AC AAGAAC GC C C AG C G AA 

AAC AG C AGT AAAT AT C TAT AAAT T AC AAG C T GAT AG T TAT AAAT T G GAAA 
TTACTTCTAATGGTGGTATCGAGAATAAAGACGGCGAAGTAATATCTAAC 
TATGCTAAACTTGGTGACAATGTAAAAGGTTTGCAAGGTGTACAGTTTAA 
AC GT TAT AAAG T C AAG AC GG AT AT T T CT G T T GAT G AAT T G AAAAAAT T G A 
C AAC AGT T G AAG C AG C AG AT G C AAAAGT T G G AAC GAT T CT T GAAG AAGG T 
GTCAGTCTACCTCAAAAAACTAATGCTCAAGGTTTGGTCGTCGATGCTCT 
GG AT T C AAAAAGT AAT G T GAG AT AC T T G T AT G TAG AAG AT T T AAAG AAT T 
CACCTTCAAACATTACCAAAGCTTATGCTGTACCGTTTGTGTTGGAATTA 
CCAGTTGCTAACTCTACAGGTACAGGTTTCCTTTCTGAAATTAATATTTA 
CCCTAAAAACGTTGTAACTGATGAACC AAAAAC AGAT AAAGATGTT AAAA 
AATTAGGTCAGGACGATGCAGGTTATACGATTGGTGAAGAATTCAAATGG 
T T C T T GAAAT CT AC AAT C C CT G C C AAT T T AG GT G ACT AT G AAAAAT T T G A 
AAT T AC T GAT AAAT T T G C AG AT G G CT T GAC T TAT AAAT C T GT T GG AAAAA 
T C AAG AT T G GT T C G AAAAC AC T G AAT AG AG AT GAG C AC T AC ACT AT T GAT 
G AAC C AAC AGT T GAT AAC C AAAAT AC AT T AAAAAT T AC G T T T AAAC C AG A 
GAAAT T T AAAG AAAT T GCT GAG C T AC T T AAAGG AAT GAC CC T T GT T AAAA 
AT C AAG AT G C T C T T GAT AAAG CT AC T G C AAAT AC AG AT GAT G C GG C AT T T 
T T G GAAAT T C C AGT T G CAT C AACT AT T AAT G AAAAAGC AGT T T TAG GAAA 
AGCAATTGAAAATACTTTTGAACTTCAATATGACCATACTCCTGATAAAG 
CTGACAATcCAAAACCATCTAATCCTCCAAGAAAACCAGAAGTTCATACT
GGT GGG AAACGAT T T GT AAAG AAAG ACT C AAC AGAAAC AC AAAC ACT AGG 
TGGTGCTGAGTTTGATTTGTTGGCTTCTGATGGGACAGCAGTAAAATGGA 
C AG AT G C T CT T AT T AAAG C G AAT AC T AAT AAAAACT AT AT T G CT GGAG AA 
GCTGTTACTGGGCAACCAATCAAATTGAAATCACATACAGACGGTACGTT 
TGAGATTAAAGGTTTGGCTTATGCAGTTGATGCGAATGCAGAGGGTACAG 
C AGT AAC T T AC AAAT T AAAAG AAAC AAAAGC AC C AG AAGGT TAT GT AAT C 
CC T GAT AAAG AAAT C G AGT T T AC AGT AT C AC AAAC AT C T TAT AAT C C AAA 
AC C AAC T G AC AT C AC GG T T GAT AGT G C T GAT G C AAC AC CT GAT AC AAT T A 
AAAACAACAAACGTCCTTCA 

SEQ ID NO. 8709 

STRAIN JM9130013 

G C AGAAG T GT C AC AAG AAC G C C C AG C G AAAAC AG C AGT A 
AATATCTATAAATTACAAGCTGATAGTTATAAATCGGAAATTACTTCTAA 
T GGT G G T AT C G AG AAT AAAGAC G G C G AAG T AAT AT CT AACT AT G CT AAAC 
T T GG T G AC AAT GT AAAAGGT T T G C AAG G T GT AC AGT T T AAAC GT TAT AAA 
GT C AAG AC G GAT AT T T C T GT T GAT GAAT T G AAAAAAT T G AC AAC AGT T GA 
AG C AG C AG AT G CAAAAGT T GG AAC GAT T C T T G AAG AAGGT GT CAGT C T AC 
CTCAAAAAACTAATGCTCAAGGTTTGGTCGTCGATGCTCTGGATTCAAAA 
AGTAATGTGAGATACTTGTATGTAGAAGATTTAAAGAATTCACCTTCAAA 
CAT T AC C AAAG CT T AT GC T GT AC CGTTTGTGT T GG AAT T AC C AGT T G CT A 
ACTCTACAGGTACAGGTTTCCTTTCTGAAATTAATATTTACCCTAAAAAC 
G T T G T AACT GAT GAAC C AAAAAC AG AT AAAG AT GT T AAAAAAT T AGG T C A 
G G AC GAT GC AGGT T AT AC GAT T G GT G AAG AAT T C AAAT GGT T CT T G AAAT 
CT AC AAT CC CTG C C AAT T T AG GT G ACT AT G AAAAAT T T GAAAT T ACT GAT 
AAATTTGCAGATGGCTTGACTTATAAATCTGTTGGAAAAATCAAGATTGG 
T T C G AAAAC AC T GAAT AG AG AT G AG C ACT AC AC TAT T GAT G AAC C AAC AG 
T TGAT AACCAAAAT ACAT T AAAAAT TACGT T TAAAC CAGAGAAATT T AAA 
GAAAT T GCT G AGCT AC T T AAAGGAAT G AC C C T T GT T AAAAAT C AAG AT G C 
T CT T GATAAAGCT ACTGC AAAT AC AGAT GAT GCGGCATT TT T GGAAATT C 
C AGT T G CAT C AAC TAT T AAT G AAAAAGC AG T T T TAG G AAAAGC AAT T G AA 
AATACTTTTGAACTTCAATATGACCATACTCCTGATAAAGCTGACAATCC 
AAAAC CAT C T AAT c CT c C AAGAAAAC C AG AAGT T CAT ACT G GT G G G AAAC 
GAT TTGTAAAGAAAGACTC AAC AGAAAC ACAAAC ACT AGGTGGTGCTGAG 
TTTGATTTGTTGGCTTCTGATGGGACAGCAGTAAAATGGACAGATGCTCT 
T AT T AAAG CG AAT AC T AAT AAAAAC TAT AT T G CT G GAG AAG C T G T T ACT G 
G G C AAC C AAT C AAAT T GAAAT C AC AT AC AG AC G GT AC GT T T GAG ATT AAA 
GGTTTGGC T T AT G C AGT T GAT G C GAAT G C AG AGGG T AC AG C AG T AAC T T A 
CAAAT T AAAAGAAACAAAAGCAC CAGAAGGT TATGT AAT CC CT GAT AAAG 
AAATCGAGT T TAG AGT AT C ACAAAC AT CT TAT AAT ACAAAACCAACT GAC 
AT C AC G GT T GAT AGT GCT GAT G C AAC AC C T GAT AC AAT T AAAAAC AAC AA 
ACGTCCTTCA 

SEQ ID NO. 8710 
STRAIN 2603 frame: 1 

MKLSKKLLFSAAVLTMVAGSTVEPVAQFATGMSIVRAAEVSQERPAKTTVNIYKLQADSY 
KSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFKRYKVKTDISVDELKKLTTVEAAD 
AKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLYVEDLKNSPSNITKAYAVPFVLEL
PVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQDDAGYTIGEEFKWFLKSTIPANL
GDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHYTIDEPTVDNQNTLKITFKPEKFK
EIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVASTINEKAVLGKAIENTFELQYDHT
PDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLGGAEFDLLASDGTAVKWTDALIKA
NTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVDANAEGTAVTYKLKETKAPEGYVI 
PDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNKRPSIPNTGGIGTAIFVAIGAAVM 
AFAVKGMKRRTKDN 

SEQ ID NO. 8711 

STRAIN 090 frame: 1 

AEVSQERPAKTAVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFK 
RYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLY 
VEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQ 
DDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHY 
TIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVAS 
TINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLG
GAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVD
ANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNK 
RPS 

SEQ ID NO. 8712 

STRAIN 18RS21 frame: 1 

AEVSQERPAKTAVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFK 
RYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLY
VEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVK.LGQ
DDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHY
TIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVAS
TINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLG
GAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVD
ANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNK 
RPS 

SEQ ID NO. 8713 

STRAIN M732 frame: 1

AEVSQERPAKTTVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFK 
RYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLY
VEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQ
DDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHY
TIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVAS
TINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLG 
GAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVD 
ANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNK 
RPS 

SEQ ID NO. 8714 

STRAIN M781 frame: 1 

AEVSQERPAKTAVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFK 
RYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLY
VEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQ
DDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHY
TIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVAS
TINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLG
GAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVD
ANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNK 
R 

SEQ ID NO. 8715 

STRAIN COH1 frame: 1 

AEVSQERPAKTAVNIYKLQADSYKSEITXNGGIENKDGEVISNYAKLGDNVKGLQGVQFK 
RYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLY
VEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQ 
DDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHY 
TIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVAS 
TINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLG 
GAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVD 
ANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNK 
RPS 

SEQ ID NO. 8716 

STRAIN CJB110 frame: 1 

AEVSQERPAKTAVNIYKLQADSYKLEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFK 
RYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLY 
VEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQ 
DDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHY 
TIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVAS 
TINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLG 
GAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVD 
ANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNPKPTDITVDSADATPDTIKNNK 
RPS 

SEQ ID NO. 8717

STRAIN JM9130013 frame: 1 

AEVSQERPAKTAVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFK 
RYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLY
VEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQ
DDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHY
TIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVAS
TINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLG
GAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVD
ANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNK
RPS 

SEQ ID NO. 8718 

STRAIN A909 frame: 1 

AEVSQERPAKTTVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFK 
RYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLY
VEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQ
DDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHY 
TIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVAS 
TINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLG 
GAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVD 
ANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNN

SEQ ID NO. 8801 
STRAIN 2603 

AT G C CT AAG AAG AAAT C AG AT AC C C C AG AAAAAG AAGAAGT T GT CT T AACG G AAT G G C AA 
AAG C GT AAC C T T G AAT T TT T AAAAAAAC G C AAAG AAG AT G AAG AAG AAC AAAAAC G T AT T 
AACG AAAAAT T AC G C T T AG AT AAAAG AAGT AAAT T AAAT AT TTCTTCTCCT G AAGAAC CT 
C AAAAT ACT ACT AAAAT T AAGAAGCT T CATTT T CCAAAGATTT CAAGACCT AAGATTGAA 
AAG AAAC AG AAAAAAGAAAAAAT AGT C AAC AG C T T AGC C AAAAC T AAT C G CAT TAG AAC T 
GCACCTATATTTGTAGTAGCATTCCTAGTCATTTTAGTTTCCGTTTTCCTACTAACTCCT 
T T T AG T AAG C AAAAAAC AAT AAC AGT T AGT G G AAAT C AG CAT AC AC C T GAT GAT AT T T T G 
AT AG AG AAAAC G AAT AT T C AAAAAAAC GAT TAT TTCTTTTCT T T AAT T T T T AAAC AT AAA 
GCTATTGAACAACGTTTAGCTGCAGAAGATGTATGGGTAAAAACAGCTCAGATGACTTAT 
C AAT T T C C C AAT AAG T T T CAT AT T C AAGT T C AAGAAAAT AAG AT T AT T G CAT AT G C AC AT 
ACAAAGCAAGGATAT CAAC CTGT CTT GGAAACTGGAAAAAAGGCT GAT CCT GTAAAT AGT 
T C AG AG CT ACC AAAG C AC T T CT T AAC AAT T AAC CT T G AT AAGG AAG AT AGT AT T AAG C T A 
TT AATTAAAGATTTAAAGGC TT TAGAC CCT GAT TT AAT AAGT GAG AT T CAGGT GAT AAGT 
T T AG C T GAT T CT AAAAC G AC AC C T G AC CT C CT G C T G T TAG AT AT G C AC G ATG G AAAT AGT 
ATT AGAAT ACC AT T AT CTAAAT TT AAAGAAAGACT T CCTT T TT AC AAACAAATT AAG AAG 
AACCTTAAGGAACCTTCTATTGTTGATATGGAAGTGGGAGTTTACACAACAACAAATACC 
AT T G AAT CAAC C C CT GT T AAAG C AGAAG AT AC AAAAAAT AAAT C AACT GAT AAAAC AC AA 
AC AC AAAAT G G T C AG GT T G C G G AAAAT AGT C AAG GAC AAAC AAAT AAC T C AAAT ACT AAT 
C AACAAGGAC AAC AGAT AGC AACAGAGCAGGCACCT AAC CCTC AAAAT GTT AAT 

SEQ ID NO. 8802 
STRAIN H36B 

CCT AAGAAG AAAT C AG AT AC C C C AG AAAAAG AAG AAG T T 
GTCTTAACGGAATGGCAAAAGCGTAACCTTGAATTTTTAAAAAAACGCAA 
AGAAG AT G AAG AAG AAC AAAAAC GT AT T AAC G AAAAAT TACG C T T AG AT A 
AAAGAAGT AAAT T AAAT AT T T CTT CT CCTGAAGAAC CT C AAAAT ACT ACT 
AAAATT AAGAAG CTT CAT T T T CC AAAGATT T CAAGAC CTAAGATT GAAAA 
G AAAC AG AAAAAAGAAAAAAT AGT CAAC AG CTT AG C C AAAACT AAT C G C A 
TTAGAACTGCACCTATATTTGTAGTAGCATTCCTAGTCATTTTAGTTTCC 
GTTTTCCTACTAACTCCTTTTAGTAAGCAAAAAACAATAACAGTTAGTGG 
AAAT CAGCATACACCT GAT GAT ATTTTGATAGAGAAAACGAAT ATT CAAA 
AAAACGATTATTTCTTTTCTTTAATTTTTAAACATAAAGCTATTGAACAA 
C GT T TAG C T G C AG AAG AT G T AT G G G T AAAAAC AG C T C AG AT G ACT TAT C A 
ATTT CC C AAT AAGTTT CAT ATT CAAGTT C AAGAAAAT AAGAT TAT T GCAT 
AT G C AC AT AC AAAG C AAGGAT AT CAAC CTGTCTTG G AAAC T GG AAAAAAG 
GCTG AT C CTGT AAAT AGTTCAGAGCT ACC AAAGCACTTCTTAACAATTAA 
CCTT GAT AAGG AAG AT AGT AT T AAG C TAT T AAT T AAAG AT T T AAAG G C T T 
TAGAC CCT GAT T T AAT AAG T GAG AT T CAGGT GAT AAGT T T AGCT GAT T CT 
AAAACGACACCTGACCTCCTGCTGTTAGATATGCACGATGGAAATAGTAT
T AG AAT AC CAT TAT C T AAAT T T AAAG AAAG ACT T C C T T T T T AC AAAC AAA 
T T AAG AAG AAC C T T AAG G AAC C T T C TAT T GT T GAT AT G GAAGT G G G AGT T 
T AC AC AAC AAC AAAT AC CAT T G AAT C AAC C C CT GT T AAAG C AGAAGAT AC 
AAAAAAT AAAT C AAC T G AT AAAAC AC AAAC AC AAAAT G GT C AG GT T G C GG 
AAAAT AGT CAAGGAC AAAC AAAT AAC T C AAAT AC T AAT C AAC AAG G AC AA 
C AGAT AG C AAC AGAG C AGG C AC C T AAC C C T C AAAAT GT T AAT 

SEQ ID NO. 8803 
STRAIN 18RS21 

C C T AAGAAG AAAT C AG AT AC C C C AGAAAAAG AAGAAGT T 
GTCTTAACGGAATGGCAAAAGCGTAACCTTGAATTTTTAAAAAAACGCAA 
AGAAGAT G AAG AAG AAC AAAAAC GT AT T AAC G AAAAAT T ACGC T T AGAT A 
AAAG AAGT AAAT T AAAT AT TTCTTCTCCT G AAG AAC C T C AAAAT AC TACT 
AAAAT T AAG AAG C T T C ATT T TC C AAAG AT T T C AAG AC CT AAG AT T GAAAA 
G AAAC AG AAAAAAGAAAAAAT AGT C AAC AG C T T AGC C AAAAC T AAT C GC A 
T TAG AAC T G C AC C TAT AT T T GT AGT AGC AT T C C T AG T CAT T T T AGT T T C C 
GTTTTCCTACTAACTCCTTTTAGTAAGCAAAAAACAATAACAGTTAGTGG 
AAAT CAGC AT ACACCT GAT GAT ATT TTGAT AGAGAAAACGAATAT TC AAA 
AAAACGATTATTTCTTTTCTTTAATTTTTAAACATAAAGCTATTGAACAA 
CGT T T AGCTGC AGAAGAT GTATGGGT AAAAAC AGCTCAGATG ACT TATC A 
AT T T C C C AAT AAGT T T CAT AT T C AAGT T CAAG AAAAT AAGAT TAT T GC AT 
AT G C AC AT AC AAAG C AAGGAT AT C AAC C T GT CT T GGAAAC T GGAAAAAAG 
GC T GAT C C T GT AAAT AG T T C AG AGCT AC C AAAGC AC T T C T T AAC AAT T AA 
CCTTGATAAGGAAGATAGTATTAAGCTATTAATTAAAGATTTAAAGGCTT 
TAGACCCTGATTTAATAAGTGAGATTCAGGTGATAAGTTTAGCTGATTCT 
AAAAC G AC AC C T G AC CT C C T G C T G T TAG AT AT G C AC GAT G G AAAT AGT AT 
TAG AAT AC CAT TAT C T AAAT T T AAAGAAAGACT T C CT T T T T AC AAAC AAA 
TTAAGAAGAACCTTAAGGAACCTTCTATTGTTGATATGGAAGTGGGAGTT 
T AC AC AAC AAC AAAT AC CAT T GAAT C AAC C C C T GT T AAAG C AG AAGAT AC 
AAAAAAT AAAT CAACT GAT AAAAC AC AAAC AC AAAAT GGT CAGGT T GCGG 
AAAAT AG T C AAG G AC AAAC AAAT AAC TC AAAT ACT AAT C AAC AAG G AC AA 
C AG AT AG C AAC AG AG C AG G C AC CT AAC C CT C AAAAT GT T AAT 

SEQ ID NO. 8804 
STRAIN M732 

C C T AAG AAG AAAT C AG AT AC C C C AG AAAAAG AAG AAG 
TTGTCTTAACGGAATGGCAAAAGCGTAACCTTGAATTTTTAAAAAAACGC 
AAAG AAG AT GAAG AAG AAC AAAAACGT AT T AAC GAAAAAT T AC G CT TAG A 
TAAAAG AAGT AAAT T AAAT AT TTCTTCTCCT G AAG AAC CT C AAAAT AC T A 
CT AAAAT T AAG AAGC T T CAT T T T C C AAAG AT T T C AAAAC C T AAG AT T G AA 
AAG AAAC AGAAAAAAG AAAAAAT AGT C AAC AG C T TAG C C AAAACT AAT CG 
CATTAGAACTGCACCTATATTTGTAGTAGCATTCCTAGTCATTTTAGTTT 
C C GT T T T C CT AC T AACT C CT T T T AGT AAG CAAAAAAC AAT AAC AGT T AGT 
G G AAAT C AG CAT AC AC C T GAT GAT AT T T T GAT AGAAAAAACG AAT AT T C A 
AAAAAACGATTATTTCTTTTCTTTAATTTTTAAACATAAAGCTATTGAAC 
AAC G T T T AGC T G C AG AAG AT GT AT G G GT AAAAAC AG CT C AG AT G ACT TAT 
C AAT T T C C C AAT AAGT T T CAT AT T C AAGT T CAAG AAAAT AAG AT TAT T G C 
AT AT G C AC AT AC AAAG C AAGGAT AT C AG C C T GT C T T GGAAAC T G G AAAAA 
AGGCTGATCCTGT AAAT AGT TCAGAGCTACCAAAGCACTTCTTAACAATT 
AAC C T T G AT AAGG AAG AT AG T AT T AAG CT AT T AAT T AAAG AT T T AAAGG C 
T TT AG AC C C T GAT T T AAT AAGT GAG AT T C AG GT GAT AAGT T T AG CT GAT T 
C T AAAAC G AC AC C T G AC CTCCTGCTG T TAG AT AT G CAT GAT GG AAAT AGT 
ATTAGAATACCATTATCTAAATTTAAAGAAAGACTTCCTTTTTACAAACA 
AAT T AAG AAG AAC CT T AAGG AAC C T T CT AT T GT T GAT AT GG AAGT G GG AG 
TTTACACAACAACAAGTACTATTGAATCAACCCCTGTGAAAGCGGAAGAT 
AC AAAAAAT AAAT C AAC T GAT AAAAC AC AAAC AC AAAAT GGT C AG GT T G C 
GG AAAAT AG T CAAGGAC AAAC AAAT AAC T C AAAT AC T AAT C AAC AAGGAC 
AAC AG AT AG C AAC AG AG C AG G C AC C C AAC C CT C AAAAT GT T AAT 

SEQ ID NO. 8805 
STRAIN COH1 

C CT AAG AAG AAAT C AG AT AC C C C AGAAAAAG AAG AAGT T 
GTCTTAACGGAATGGCAAAAGCGTAACCTTGAATTTTTAAAAAAACGCAA 
AGAAGATGAAGAAGAACAAAAACGTATTAACGAAAAATTACGCTTAGATA
AAAGAAGT AAAT T AAAT AT TTCTTCTCCT G AAG AAC C T C AAAAT AC T AC T 
AAAAT T AAGAAGCT T CAT T T T C C AAAG AT T T C AAAAC C T AAGAT T G AAAA 
GAAACAGAAAAAAGAAAAAATAGT CAACAGCTT AGC CAAAACTAATCGCA 
T T AGAAC T G CAC CT AT AT T T GT AGT AG CAT T C C T AGT C AT T T T AGT T T CC 
GT T T T C CT AC T AAC T C C T T T TAG T AAGC AAAAAACAAT AAC AGT T AGT GG 
AAAT C AG CAT AC AC CT GAT GAT AT T T T GAT AG AAAAAAC G AAT AT T CAAA 
AAAAC GAT TAT TTCTTTTCTT T AAT T T T T AAAC AT AAAG CT AT T GAAC AA 

CGTTTAGCTGCAGAAGATGTATGGGTAAAAACAGCTCAGATGACTTATCA 
AT T T C C C AAT AAGT T T CAT AT T C AAGT T C AAG AAAAT AAG AT TAT T G CAT 
AT G C ACAT AC AAAGCAAGG AT AT C AG CCT GT CT T G GAAACT GG AAAAAAG 
GC T GAT C C T GT AAAT AGT T C AG AG C T AC C AAAGC ACT T CT T AAC AAT T AA 
C C T T GAT AAGG AAGAT AGT AT T AAG C TAT T AAT T AAAG AT T T AAAGG C T T 
TAG AC CCT GAT T T AAT AAG T G AGAT T C AGGT GAT AAGT T TAG C T GAT T CT 
AAAAC G AC AC C T G AC CTCCTGCTGT TAG AT AT G CAT GAT G G AAAT AG TAT 
T AG AAT AC CAT T AT C T AAAT T T AAAG AAAG AC TTCCTTTT T AC AAAC AAA 

TTAAGAAGAACCTTAAGGAACCTTCTATTGTTGATATGGAAGTGGGAGTT 
T AC AC AAC AAC AAGT AC TAT T G AAT CAAC CC C T GT G AAAG CGG AAG AT AC 
AAAAAAT AAAT CAACT GAT AAAAC AC AAAC AC AAAAT GG T C AGGT T G CGG 
AAAAT AGT CAAGGAC AAAC AAAT AAC TC AAAT ACT AAT CAACAAGGACAA 
C AG AT AG C AAC AGAG CAG G CAC C CAAC CCT C AAAAT GT T AAT 

SEQ ID NO. 8806
STRAIN M781

C C T AAGAAG AAAT CAG AT AC C C C AG AAAAAG AAG AAG 

TTGTCTTAACGGAATGGCAAAAGCGTAACCTTGAATTTTTAAAAAAACGC 
AAAG AAG AT G AAG AAGAAC AAAAAC GT AT T AAC GAAAAAT T AC GCT T AG A 
T AAAAGAAG T AAAT T AAAT AT TTCTTCTCCT GAAG AAC CT C AAAAT AC T A 
C T AAAAT T AAG AAG C T T CAT T T T C C AAAG AT T T C AAAAC C T AAG AT T G AA 
AAGAAAC AG AAAAAAG AAAAAAT AGT CAAC AG C T TAG C C AAAAC T AAT CG 
CAT T AG AACT G CAC C TAT AT T T GT AG T AGC AT T C C T AGT CAT T T T AGT T T 
CCGTTTTC C TACT AAC T C C T T T T AGT AAG C AAAAAAC AAT AAC AG T T AGT 
G GAAAT CAG CAT AC AC C T GAT GAT AT T T T GAT AG AAAAAAC G AAT AT T C A 

AAAAAACGATTATTTCTTTTCTTTAATTTTTAAACATAAAGCT ATT GAAC 
AAC GT T TAG C T G C AGAAG AT G TAT G G G T AAAAAC AG CT C AGAT GAC T TAT 
C AAT T T C C C AAT AAGT T T CAT AT T C AAG T T C AAG AAAAT AAGAT TAT T G C 
AT AT G CAC AT AC AAAG C AAGG AT AT CAG C C T GT C T T GG AAACT G G AAAAA 
AG GCT GAT CCT GT AAAT AG TT CAG AGC T AC C AAAG CAC T T C T T AAC AAT T 
AAC C T T G AT AAGGAAG AT AGT ATT AAG C TAT T AAT T AAAG AT T T AAAGG C 

TTTAGACCCTGATTTAATAAGTGAGATTCAGGTGATAAGTTTAGCTGATT 
CT AAAAC GAC AC CT GAC C T CC T GC T GT T AG AT AT G CAT G AT GG AAAT AGT 

ATTAGAATACCATTATCTAAATTTAAAGAAAGACTTCCTTTTTACAAACA 
AAT T AAG AAGAAC C T T AAG GAAC C T T C T AT T GT T G AT AT GG AAGT GGG AG 
T T T AC AC AAC AAC AAG TACT AT T G AAT CAAC C C C T GT GAAAGC G G AAGAT 

ACAAAAAATAAATCAACTGATAAAACACAAACACAAAATGGTCAGGTTGC 
GGAAAATAGT CAAGGACAAACAAATAACT CAAATACTAAT C AACAAGGAC 
AAC AG AT AG CAAC AG AGC AG GC AC C C AAC CCT C AAAAT GT T AAT 

SEQ ID NO. 8807 
STRAIN CJB110 

C C T AAG AAGAAAT CAG AT AC CC CAG AAAAAG AAG AAG 

TTGTCTTAACGGAATGGCAAAAGCGTAACCTTGAATTTTTAAAAAAACGC 
AAAG AAG AT GAAG AAGAAC AAAAAC G TAT T AAC G AAAAAT T ACG CT T AG A 
TAAAAG AAGT AAAT T AAAT AT TTCTTCTCCT G AAGAAC C T C AAAAT AC T A 
C T AAAAT T AAG AAG C T T CAT T T T C C AAAG AT T T C AAAAC C T AAG AT T G AA 
AAG AAAC AG AAAAAAGAAAAAAT AGT CAAC AG C T TAG C C AAAAC T AAT CG 
CATTAGAACTGCACCTATATTTGTAGTAGCATTCCTAGTCATTTTAGTTT 
CCGTTTTCCTACTAACTCCTTTTAGTAAGCAAAAAACAATAACAGTTAGT 
GG AAAT CAG CAT AC AC CT G AT GAT AT T T T GAT AG AAAAAAC G AAT AT T C A 

AAAAAACGATTATTTCTTTTCTTTAATTTTTAAACATAAAGCTATTGAAC 
AAC G T T T AG C T G CAG AAG AT G TAT GGG T AAAAAC AG C T CAG AT G ACT TAT 
C AAT T T C C C AAT AAG T T T CAT AT T C AAGT T C AAG AAAAT AAG AT TAT T G C 
AT AT G CAC AT AC AAAGC AAG GAT AT CAG CCTGTCTTG GAAACT G G AAAAA 
AG GCT GAT C C T GT AAAT AGT T C AGAG C T AC C AAAGC AC T T C T T AAC AAT T 
AACCTTGATAAGGAAGATAGTATTAAGCTATTAATTAAAGATTTAAAGGC
T T T AGAC C C T GAT T T AAT AAGT G AG AT T C AGGT G AT AAGT T T AGC T G ATT 
C T AAAAC GAC AC C T G AC CT CCTGCTGT T AGAT AT GC AT GAT GG AAAT AGT 
ATTAGAATACCATTATCTAAATTTAAAGAAAGACTTCCTTTTTACAAACA 
AAT T AAG AAG AAC C T T AAGG AAC C T T C TAT TGT T GAT AT GG AAGT GGG AG 
T T T AC AC AAC AAC AAGT AC TAT T GAAT C AACC C CT GT GAAAG C G G AAG AT 
AC AAAAAAT AAAT C AAC T GAT AAAACAC AAAC AC AAAAT GGT C AGGT T GC 
G GAAAAT AGT C AAGGAC AAAC AAAT AACT C AAAT AC T AAT C AAC AAG GAC 
AAC AG AT AG C AAC AG AG C AGG C AC C CAAC C C T C AAAAT GT T AAT 

SEQ ID NO. 8808 
STRAIN 1169NT 

C CT AAGAAG AAAT C AGAT AC C CC AGAAAAAG AAGAAGT 

TGTCTTAACGGAATGGCAAAAGCGTAACCTTGAATTTTTAAAAAAACGCA 
AAGAAG AT G AAG AAG AAC AAAAAC GT AT T AAC G AAAAAT T AC G CT TAG AT 
AAAAG AAGT AAAT T AAAT AT TTCTTCTCCT GAAG AAC CT C AAAAT AC T AC 
T AAAAT T AAG AAG C T T CAT T T T C C AAAG AT T T C AAAAC C T AAG AT T GAAA 

AGAAACAGAAAAAAGAAAAAATAGTCAACAGCTTAGCCAAAACTAATCGC 
AT TAG AAC T G C AC C TAT AT T TAT AGT AG CAT T C C T AGT CAT T T T AGT T T C 
CGTTTTCC T AC T AAC T C CT TT T AGT AAGC AAAAAAC AAT AAC AGT T AGT G 
G AAAT C AGC AT AC AC C T GAT GAT AT T T T GAT AG AG AAAACG AAT AT T C AA 
AAAAAC GAT TAT TTCTTTTCTT T AAT T T T T AAAC AT AAAG C TAT T G AAC A 
AC GT T TAG CT G C AG AAGAT G TAT GG GT AAAAAC AG CT C AG AT GACTT AT C 
AAT T T C C CAAC AAG T T T CAT AT T C AAGT T C AAG AAAAT AAG AT TAT T G C A 
T A t G C AC AT AC AAAG C AAG GAT AT C AG C C T G T C T T GG AAAC T G G AAAAAA 
GG C T GAT C CT GT AAAT AGT T C AG AG CT AC C AAAGC AC T T C T T AAC AAT T A 
AC C T T GAT AAG GAAG AT AG TAT T AAG C T AT T AAT T AAAG AT T T AAAGGC T 
T TAG AC C C T GAT T T AAT AAGT GAG AT T C AGGT GAT AAG T T TAG C T GAT T C 
T AAAAC GAC AC C T GAC C TCCTGCTGT TAG AT AT G C AC GAT G G AAAT AG T A 
T TAG AAT AC CAT TAT C T AAAT T T AAAG AAAG AC T T C C TT T T T AC AAAC AA 
AT T AAG AAG AAC C T T AAG G AAC C T T CT AT T G T T GAT AT GG AAGT G G GAGT 
T T AC AC AAC AAC AAGT AC TAT T GAAT CAAC C CC T GT GAAAG CG GAAG AT A 
C AAAAAAT AAAT CAAC T GAT AAAAC AC AAAC C C AAAAT GGT C AGGT T G C G 
GAAAAT AGT CAAGGACAAACAAAT AACT CAAAT ACT AAT CAAC AAGGACA 
AC AAC AGAT AG C AAC G G AG C AGG C AC C CAAC C C T C AAAAT GT T AAT 

SEQ ID NO. 8809 
STRAIN JM9130013 

C C T AAG AAG AAAT C AG AT AC C C C AGAAAAAG AAGAAG T T 

GTCTTAACGGAATGGCAAAAGCGTAACCTTGAATTTTTAAAAAAACGCAA 
AGAAG AT GAAG AAG AAC AAAAAC GT AT T AAC G AAAAAT T AC G C T T AGAT A 
AAAG AAG T AAAT T AAAT ATT T CT T CT C C T GAAG AAC C T C AAAAT AC TACT 
AAAA T T AAG AAG C T T CAT T T T C C AAAG AT T T C AAG A C C T AAGAT T G AAAA 
G AAAC AG AAAAAAG AAAAAAT AG T CAAC AG C T T AG C C AAAACT AAT C GC A 
TTAGAACTGCACCTATATTTGTAGTAGCATTCCTAGTCATTTTAGTTTCC 
GTTTTCCTACTAACTCCTTTT AGT AAGC AAAAAAC AAT AAC AGTTAGTGG 
AAAT C AG CAT AC AC CT GAT GAT AT T T T GAT AG AG AAAAC GAAT AT T C AAA 
AAAACG AT TAT TTCTTTTCTT T AAT T T T T AAAC AT AAAG C TAT T G AAC AA 

CGTTTAGCTGCAGAAGATGTATGGGTAAAAACAGCTCAGATGACTTATCA 
AT T T C C C AAT AAG T T T CAT AT T C AAGT T C AAGAAAAT AAG AT TAT T G CAT 
AT G C AC AT AC AAAGC AAGG AT AT CAAC CTGTCTTG GAAAC T G G AAAAAAG 
G CT GAT C C T G T AAAT AG T T C AG AG C T AC C AAAG C AC T T C T T AAC AAT T AA 
CCTTGATAAGGAAGATAGTATTAAGCTATTAATTAAAGATTTAAAGGCTT
TAG AC C C T GAT T T AAT AAGT GAG AT T C AGGT GAT AAGT T T AG CT G AT T C T 
AAAAC GAC AC C T GAC CTCCTGCTGT TAG AT AT G C AC GAT G G AAAT AG TAT 
TAG AAT AC CAT TAT C T AAAT T T AAAG AAAG AC TTCCTTTT T AC AAAC AAA 
T T AAG AAG AAC C T T AAGG AAC C TT CT AT T GT T GAT AT G G AAGT GG GAGT T 
T AC AC AAC AAC AAAT AC CAT T GAAT C AACC C C T GT T AAAG C AG AAG AT AC 
AAAAAAT AAAT CAAC T GAT AAAAC AC AAAC AC AAAAT GGT C AG GT T G CGG 
AAAAT AG T C AAG GAC AAAC AAAT AAC T CAAAT AC T AAT CAAC AAG GAC AA 
C AG AT AG CAAC AG AG C AG G C AC C T AAC C CT C AAAAT G T T AAT 

SEQ ID NO. 8810 
STRAIN A909 



WO 2004/018646 



PCT/US2003/026827 



SEQUENCE LISTING 



C C T AAG AAG AAAT C AG AT AC C C C AG AAAAAGAAG AAGT T GT C 

TTAACGGAATGGCAAAAGCGTAACCTTGAATTTTTaaAAAAACGCAAAGA 
AGAT GAAGAAGAa CAAAAACGT ATT AACGAAAAAT TACGCTTAGATAAAA 
GAAGT AAAT T AAAT AT TTCTTCTCCT GAAGAAC CT C AAAAT ACT AC T AAA 
AT T AAGAAG C T T CAT T T T C C AAAGAT T T C AAGAC C T AAGAT T G AAAAG AA 
AC AG AAAAAAG AAAAAAT AGT C AAC AGC T TAG C CAAAACT AAT C G C AT T A 
G AACT GC AC C TAT AT T T GT AG TAG CAT T C C T AGT CAT T T T AGTT T C C G T T 
TTCCTACTAACTCCTTTTAGTAAGCAAAAAACAATAACAGTTAGTGGAAA 
T CAGCAT ACAC CT GAT GAT AT T T T G AT AGAG AAAAC G AAT AT T C AAAAAA 
ACGATTATTTCTTTTCTTTAATTTTTAAACATAAAGCTATTGAACAACGT 
TTAGCTGCAGAAGATGTATGGGTAAAAACAGCTCAGATGACTTATCAATT 
T C C C AAT AAGT T T CAT AT T C AAGT T CAAGAAAAT AAGAT TAT T G CAT AT G 
CACATAC AAAGCAAGGAT AT CAACCT GT CT T GGAAACT GGAAAAAAGGCT 
GAT C C T GT AAAT AGT T CAGAG CT AC C AAAG C AC T T C T T AAC AAT T AAC C T 
TGATAAGGAAGATAGTATTAAGCTATTAATTAAAGATTTAAAGGCTTTAG 
AC C C T GAT T T AAT AAGT GAG AT T C AGG T GAT AAGT T TAG C T GAT T C T AAA 

ACGACACCTGACCTCCTGCTGTTAGATATGCACGATGGAAATAGTATTAs 
AATACCATTATCTAAATTTAAAGAAAGACTTCCTTTTTACAAACAAATTA
AG AAGAAC CT T AAGG AAC C T T CT AT T GT T GAT AT GGAAGT GGGAGT T T AC 
AC AAC AAC AAAT AC CAT T G AAT C AAC C C C T G T T AAAGC AG AAGAT AC AAA 
AAAT AAAT C AAC T GAT AAAAC AC AAmC AC AAAAT G GT C AGGT T G C G G AAA 
AT AGT C AAG G AC AAAC AAAT AACT C AAAT ACT AAT C AAC AAGG AC AAC AG 
AT AG C AAC AGAG C AG GC AC CT AAC C C T C AAAAT GT T AAT 

SEQ ID NO. 8811 
STRAIN 090 

T AAG AAG AAAT C AG AT AC C C C AGAAAAAG AAG AAG T T G T C T T AACGGAAT 
GG C AAAAG C GT AAC C T T G AAT T T T T AAAAAAAC G C AAAG AAG AT GAAG AA 
GAAC AAAAACGT AT T AAC GAAAAAT T AC G C T TAG AT AAAAG AAG T a a aT T 
AAAT AT T T C T T C T C CT GAAGAAC C T C AAAAT AC T AC T AAAAT T AAG AAGC 
T T CAT T T T C C AAAG AT T T C AAAAC C T AAG AT T G AAAAGAAAC AGAAAAAA 
GAAAAAAT AGT C AAC AGC T TAG C CAAAACT AAT C G CAT TAG AAC T G C AC C 
TATATTTGTAGTAGCATTCCTAGTCATTTTAGTTTCCGTTTTCCTACTAA 
CTCCTTTTAGTAAGCAAAAAACAATAACAGTTAGTGGAAATCAGCATACA 
C C T GAT GAT AT T T T GAT AGAAAAAAC G AAT AT T C AAAAAAAC GAT TAT T T 

CTTTTCTTTAATTTTTAAACATAAAGCTATTGAACAACGTTTAGCTGCAG 
AAGAT G T AT G G GT AAAAAC AG C T C AG AT G AC T TAT C AAT T T C C C AAT AAG 
T T T CAT AT T C AAGT T CAAGAAAAT AAG AT TAT T GC AT AT G CACATAC AAA 
G C AAGG AT AT C AG CCTGTCTT GG AAAC T G G AAAAAAGG C T GAT C CT GT AA 
AT AG T T C AG AG CT AC C AAAG C AC T T C T T AAC AAT T AAC CT T G AT AAGGAA 
GAT AGT AT T AAG C T AT T AAT T AAAG AT T T AAAG G C T T TAG AC C C T GAT T T 
AATAAGTGAGATTCAGGTGATAAGTTTAGCTGATTCTAAAACGACACCTG 
ACCTCCTGCTGTTAGATATGCATGATGGAAATAGTATTAGAATACCATTA 
T CT AAAT T T AAAGAAAG AC TTCCTTTT T AC AAAC AAAT T AAG AAGAAC C T 
T AAGG AAC CT T CT AT T GT T GAT AT G GAAGT G G GAG T T T AC AC AAC AAC AA 
GT AC TAT T G AAT C AAC C C C T GT G AAAG C GGAAG AT AC AAAAAAT AAAT C A 
AC T GAT AAAAC AC AAAC AC AAAAT GG T C AG GT T G C GG AAAAT AGT C AAG G 
AC AAAC AAAT AAC T C AAAT ACT AAT C AAC AAG G AC AAC AG AT AG C AAC AG 
AG C AG G C AC C C AAC C C T C AAAAT GT T AAT 

SEQ ID NO. 8812 
STRAIN 2603 frame: 1

PKKKSDTPEKEEVVLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQ
NTTKIKKLHFPKISRPKIEKKQKKEKIVNSLAKTNRIRTAPIFVVAFLVILVSVFLLTPF
SKQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQ
FPNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLL
IKDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIRIPLSKFKERLPFYKQIKKN
LKEPSIVDMEVGVYTTTNTIESTPVKAEDTKNKSTDKTQTQNGQVAENSQGQTNNSNTNQ
QGQQIATEQAPNPQNVN

SEQ ID NO. 8813 

STRAIN H36B frame: 1

PKKKSDTPEKEEVVLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQ
NTTKIKKLHFPKISRPKIEKKQKKEKIVNSLAKTNRIRTAPIFVVAFLVILVSVFLLTPF
SKQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQ
FPNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLL
IKDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIRIPLSKFKERLPFYKQIKKN
LKEPSIVDMEVGVYTTTNTIESTPVKAEDTKNKSTDKTQTQNGQVAENSQGQTNNSNTNQ
QGQQIATEQAPNPQNVN

SEQ ID NO. 8814 

STRAIN 18RS21 frame: 1 

PKKKSDTPEKEEVVLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQ
NTTKIKKLHFPKISRPKIEKKQKKEKIVNSLAKTNRIRTAPIFVVAFLVILVSVFLLTPF
SKQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQ
FPNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLL
IKDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIRIPLSKFKERLPFYKQIKKN
LKEPSIVDMEVGVYTTTNTIESTPVKAEDTKNKSTDKTQTQNGQVAENSQGQTNNSNTNQ
QGQQIATEQAPNPQNVN

SEQ ID NO. 8815 

STRAIN M732 frame: 1 

PKKKSDTPEKEEVVLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQ
NTTKIKKLHFPKISKPKIEKKQKKEKIVNSLAKTNRIRTAPIFVVAFLVILVSVFLLTPF
SKQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQ
FPNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLL
IKDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIRIPLSKFKERLPFYKQIKKN
LKEPSIVDMEVGVYTTTSTIESTPVKAEDTKNKSTDKTQTQNGQVAENSQGQTNNSNTNQ
QGQQIATEQAPNPQNVN

SEQ ID NO. 8816 

STRAIN COH1 frame: 1 

PKKKSDTPEKEEVVLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQ
NTTKIKKLHFPKISKPKIEKKQKKEKIVNSLAKTNRIRTAPIFVVAFLVILVSVFLLTPF
SKQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQ
FPNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLL
IKDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIRIPLSKFKERLPFYKQIKKN
LKEPSIVDMEVGVYTTTSTIESTPVKAEDTKNKSTDKTQTQNGQVAENSQGQTNNSNTNQ
QGQQIATEQAPNPQNVN

SEQ ID NO. 8817 

STRAIN M781 frame: 1 

PKKKSDTPEKEEVVLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQ
NTTKIKKLHFPKISKPKIEKKQKKEKIVNSLAKTNRIRTAPIFVVAFLVILVSVFLLTPF
SKQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQ
FPNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLL
IKDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIRIPLSKFKERLPFYKQIKKN
LKEPSIVDMEVGVYTTTSTIESTPVKAEDTKNKSTDKTQTQNGQVAENSQGQTNNSNTNQ
QGQQIATEQAPNPQNVN

SEQ ID NO. 8818 

STRAIN CJB110 frame: 1 

PKKKSDTPEKEEVVLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQ
NTTKIKKLHFPKISKPKIEKKQKKEKIVNSLAKTNRIRTAPIFVVAFLVILVSVFLLTPF
SKQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQ
FPNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLL
IKDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIRIPLSKFKERLPFYKQIKKN
LKEPSIVDMEVGVYTTTSTIESTPVKAEDTKNKSTDKTQTQNGQVAENSQGQTNNSNTNQ
QGQQIATEQAPNPQNVN

SEQ ID NO. 8819 

STRAIN 1169NT frame: 1 

PKKKSDTPEKEEVVLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQ 
NTTKIKKLHFPKISKPKIEKKQKKEKIVNSLAKTNRIRTAPIFIVAFLVILVSVFLLTPF 
SKQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQ 
FPNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLL 
IKDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIRIPLSKFKERLPFYKQIKKN 
LKEPSIVDMEVGVYTTTSTIESTPVKAEDTKNKSTDKTQTQNGQVAENSQGQTNNSNTNQ 
QGQQQIATEQAPNPQNVN



SEQ ID NO. 8820 

STRAIN JM9130013 frame: 1 

PKKKSDTPEKEEVVLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQ
NTTKIKKLHFPKISRPKIEKKQKKEKIVNSLAKTNRIRTAPIFVVAFLVILVSVFLLTPF
SKQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQ
FPNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLL
IKDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIRIPLSKFKERLPFYKQIKKN
LKEPSIVDMEVGVYTTTNTIESTPVKAEDTKNKSTDKTQTQNGQVAENSQGQTNNSNTNQ
QGQQIATEQAPNPQNVN

SEQ ID NO. 8821 

STRAIN A909 frame: 1

PKKKS DT PEKEE WLTE WQKRNLE FLKKRKEDEEEQKRINEKLRLDKRS KLNI S S PEE PQ 
NTTKIKKLHFPKISRPKIEKKQKKEKIVNSLAKTNRIRTAPIFWAFLVILVSVFLLTPF 
SKQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQ 
FPNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLL
IKDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIXIPLSKFKERLPFYKQIKKN

LKEPSIVDMEVGVYTTTNTIESTPVKAEDTKNKSTDKTQXQNGQVAENSQGQTNNSNTNQ
QGQQIATEQAPNPQNVN

SEQ ID NO. 8822 

STRAIN 090 frame: 2

KKKSDTPEKEEWLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQN 
TTKIKKLHFPKISKPKIEKKQKKEKIVNSLAKTNRIRTAPIFWAFLVILVSVFLLTPFS 
KQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQF

PNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLLI 

KDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIRIPLSKFKERLPFYKQIKKNL 

KEPSIVDMEVGVYTTTSTIESTPVKAEDTKNKSTDKTQTQNGQVAENSQGQTNNSNTNQQ
GQQIATEQAPNPQNVN 

SEQ ID NO. 8901 
STRAIN 2603 

AT G AAAAAAGG AC AAG T AAAT GAT ACT AAG C AAT C T T AC T C T C T AC GT AAA 

TATAAATTTGGTTTAGCATCAGTAATTTTAGGGTCATTCATAATGGTCACAAGTCCTGTT 
T T T G CGGAT C AAAC T AC AT C GGT T C AAG T T AAT AAT C AG AC AG GC AC T AGT GT GGAT G C T 
AAT AAT T CT T C C AAT GAG AC AAG T G C GT C AAGT G T GAT T AC T T C C AAT AAT G AT AGT GT T 
C AAG C GT C T G AT AAAGT T G T AAAT AG T C AAAAT AC GG C AAC AAAG GAC AT TACT ACT C C T 

TTAGTAGAGACAAAGCCAATGGTGGAAAAAACATTACCTGAACAAGGGAATTATGTTTAT 
AGC AAAGAAAC CG AGG T G AAAAAT AC AC C T T C AAAAT C AG C C C C AG T AGC T T T C TAT G C A 
AAG AAAG GT GAT AAAGT T T T C T AT GAC C AAGT AT T T AAT AAAGAT AAT GT GAAAT G GAT T 
T CAT AT AAGT CTTTTTGT GGC GT AC G T C GAT AC G C AGC TAT T G AGT C AC T AGAT C CAT C A 
G G AG GT T C AG AG AC T AAAGC AC CT AC T C C T GT AAC AAAT T C AG G AAG C AAT AAT C AAGAG 
AAAAT AG C AAC G C AAG GAAAT TAT AC AT T T T C AC AT AAAGT AG AAGT AAAAAAT GAAG C T 
AAGG TAG C GAG T C C AACT C AAT T T AC AT T G GAC AAAG GAG AC AG AAT T T T T T ACG AC C AA 
AT AC T AAC TAT T GAAGG AAAT C AG T GGT TAT C T TAT AAAT CAT T C AAT GGTGTTCGTCGT 
TTTGTTTTG C TAG GT AAAG CAT CT T C AGT AG AAAAAAC T G AAG AT AAAGAAAAAGT GT C T 
C C T C AAC C AC AAG C C C GT AT T AC T AAAAC T G GT AGAC T GAC TAT T T C T AAC GAAAC AAC T 
AC AGGT T T T GAT AT T T T AAT T AC G AAT AT T AAAG AT GAT AAC GGT AT CGCTGCTG T T AAG 
GTACCGGTTTG GAC T G AAC AAG GAG G G C AAG AT GAT AT T AAAT G G TAT AC AG C T GT AAC T 
AC T G G G GAT G G C AAC T AC AAAGT AG C T GT AT CAT T T G C T GAC CAT AAG AAT GAG AAG GGT 
C T T TAT AAT AT T CAT T TAT AC T AC C AAG AAG C TAG T GGG AC AC T T G T AGGT G T AAC AG G A 
AC T AAAGT GAC AGT AG CT G G AAC T AAT T C T T C T C AAG AAC CT AT T G AAAAT GG T T T AG C A 
AAG AC T G GT GT T T AT AAT AT TAT C GG AAGT ACT GAAGT AAAAAAT GAAG C T AAAAT AT C A 
AGT C AG AC C C AAT T T AC T T T AG AAAAAG G T GAC AAAAT AAAT TAT GAT C AAGT AT T GAC A 
G C AGAT G GT T AC C AG T G GAT T T C T T AC AAAT C T T AT AGT G GT G T T C G T CG CT AT AT T C C T 
G T G AAAAAGC T AACT AC AAG T AGT G AAAAAG C G AAAG AT G AGG C GAC T AAAC C G AC T AGT 

TATCCCAACTTACCTAAAACAGGTACCTATACATTTACTAAAACTGTAGATGTGAAAAGT 
C AAC CT AAAGT AT C AAGT C C AGT G G AAT T T AAT T T T CAAAAG GGT G AAAAAAT AC AT TAT 
GAT C AAGT GT T AG T AGT AG AT GGT CAT C AG T G GAT T T CAT AC AAG AG T T AT T C C GG T AT T 
CGT CGCT ATATTGAAATT 

SEQ ID NO. 8902 
STRAIN 090

AAAAAAG GAC AAGT AAAT GAT AC T AAG C AAT CT T AC T 

CT C T AC GT AAAT AT AAAT T T GGT T T AGC AT C AGT AAT T T T AGGGT CAT T C 

ATAATGGTCACAAGTCCTGTTTTTGCGGATCAAACTACATCGGTTCAAGT 
T AAT AAT C AG AC AGG C AC T AGT GT GGAT G CT AAT AAT T C T T C C AAT GAGA 
C AAGT GC GT C AAGT GT GAT T AC T T C C AAT AAT GAT AGT GT T C AAGCGT CT 
G AT AAAGT T GT AAAT AGT C AAAAT AC G G C AAC AAAG GAC AT TACT ACT C C 
T T T AGT AG AGAC AAAG C C AAT G GT GG AAAAAAC AT TAG CT GAACAAG GGA 
ATTATGTTTATAGCAAAGAAACCGAGGTGAAAAATACACCTTCAAAATCA 
GCCCCAGTAGCTTTCTATGCAAAGAAAGGTGATAAAGTTTTCTATGACCA 
AGT AT TT AAT AAAG AT AAT G T GAAAT GGAT T T CAT AT AAGT CTTTTTGTG 

GCGTACGTCGATACGCAGCTATTGAGTCACTAGATCCATCAGGAGGTTCA 
G AGACT AAAG C AC C T ACT C C T GT AAC AAAT T C AGGAAG C AAT AAT C AAG A 
G AAAAT AG C AAC G C AAGG AAAT TAT AC AT T T T C AC AT AAAGT AG AAGT AA 
AAAAT G AAG c T AAGGT AG CG AGT C C AACT C AAT T T AC AT T G GAC AAAG GA 
GAC AG AAT T T T T T AC GAC C AAAT AC T AACT AT T GAAGG AAAT C AG T G GT T 

ATCTTATAAATCATTCAATGGTGTTCGTCGTTTTGTTtTGCTAGGTAAAG 
CAT CT T C AGT AG AAAAAACT G AAG AT AAAG AAAAAGT GT C T C C T C AAC C A 
C AAG C C C GT AT T AC T AAAAC T G GT AG AC T G ACT AT T T C T AAC G AAAC AAC 
TACAGGTTTTGATATTTTAATTACGAATATTAAAGATGATAACGGTATCG 
CTGCTGTTAAGGTACCGGTTTGGACTGAACAAGGAGGGCAAGATGATATT 
AAAT G GT AT AC AG C T GT AAC T AC T G GGG AT GGC AAC T AC AAAG T AG CT GT 
AT CAT T T GC T GAC CAT AAG AAT G AG AAGGGT C T T TAT AAT AT T CAT T TAT 
AC T AC C AAG AAG C T AGT G G GAC AC T T GT AG G T G T AAC AG GAAC T AAAGT G 
AC AGT AG C T GG AAC T AAT T CT T C T C AAG AAC CT AT T G AAAAT G GT T TAG C 
AAAG AC T GGT G T T TAT AAT AT TAT CGG AAGT AC T G AAG T AAAAAAT G AAG 
C T AAAAT AT C AAG T C AG AC C C AAT T T AC T T T AG AAAAAG GT GAC AAAAT A 
AAT TAT GAT C AAGT AT T G AC AGC AGAT GGT T AC C AGT GGAT T T C T T AC AA 
AT C T TAT AG TGGTGTTCGT CG CT AT AT T C CT GT GAAAAAG C T AAC T AC AA 
GTAGTGAAAAAGCGAAaGATGAGGCGACTAAACCGACTAGTTATCCCAAC 
TTACCTAAAACAGGTACCTATACATTTACTAAAACTGTAGATGTGAAGAG 
TCAACCT AAAGT AT C AAGT CC AGT GGAATTT AAT TTTCAAAAGGGTGAAA 
AAAT AC AT TAT GAT C AAG T G T T AGT AGT AG AT GGT CAT C AG T G GAT T T C A 
T AC AAG AGT TAT T C C GGT AT T C G T C G CT AT AT T GAAAT T 

SEQ ID NO. 8903 

STRAIN A909 

AAAAAAGGAC AAG T AAAT GAT ACT AAG C AAT CT T AC 

T C T C T AC G T AAAT AT AAAT T T G G T T TAG CAT C AGT AAT T T T AG GGT CAT T 

CATAATGGTCACAAGTCCTGTTTTTGCGGATCAAACTACATCGGTTCAAG 
T T AAT AAT C AG AC AG G C AC TAG T GT G GAT G CT AAT AAT T C T T C C AAT GAG 
AC AAGT G C G T C AAG T GT GAT T AC T T C C AAT AAT GAT AGT GT T C AAG C GT C 
T GAT AAAGT T GT AAAT AGT C AAAAT AC G GC AAC AAAGG AC AT T AC T AC T C 
C T T T AGT AG AG AC AAAG C C AAT GGT GG AAAAAAC AT T AC C T GAAC AAG GG 
AAT TAT GT T TAT AG C AAAG AAAC C G AGGT G AAAAAT AC AC CT T C AAAAT C 
AGC C C C AGT AG C T TT C T AT G C AAAGAAAGG T GAT AAAGT T T T CT AT GAC C 
AAGT AT T T AAT AAAGAT AAT G T GAAAT GGAT T T CAT AT AAGT CTTTTTGT 

GGCGTACGTCGATACGCAGCTATTGAGTCACTAGATCCATCAGGAGGTTC 
AGAGAC T AAAG C AC C T AC T C C T GT AAC AAAT T C AG G AAG C AAT AAT C AAG 
AG AAAAT AGC AAC G C AAGG AAAT TAT AC AT T T T C AC AT AAAG TAG AAGT A 
AAAAAT G AAG C T AAG G T AG CG AGT C C AACT C AAT T T AC AT T GG AC AAAG G 
AG AC AG AAT T T T T T AC GAC C AAAT AC T AAC TAT T G AAG GAAAT C AGT GGT 
TATCTTATAAATCATTCAATGGTGTTCGTCGTTTTGTTtTGCT AGGT AAA 
G CAT CT T C AGT AG AAAAAACT G AAG AT AAAG AAAAAGT GT C T C C T C AAC C 
ACAAGCCCGTATTACTAAAACTGGTAGACTGACTATTTCTAACGAAACAA 
C T AC AGGT T T T GAT AT T T T AAT T AC G AAT AT T AAAG AT GAT AAC GGT AT C 
GCTGCTGT T AAG GTACCGGTTTG GAC T GAACAAG GAG G G C AAG AT GAT AT 

TAAATGGTATACAGCTGTAACTACTGGGGATGGCAACTACAAAGTAGCTG 
TAT CAT T T G C T GAC CAT AAG AAT GAG AAG G GT C T T TAT AAT AT T CAT T T A 
T AC TAG C AAG AAG CT AG T GGG AC AC T T GT AG G T G T AAC AG GAAC T AAAGT 

GACAGTAGCTGGAACTAATTCTTCTCAAGAACCTATTGAAAATGGTTTAG 
C AAAG AC T GGT G T T TAT AAT AT TAT C GG AAG T AC T G AAGT AAAAAAT G AA 
G C T AAAAT AT C AAG T C AG AC C C AAT T T AC T T TAG AAAAAG GT GAC AAAAT 
AAAT TAT GAT C AAGT AT T GAC AG C AG AT G GT T AC C AGT GGAT T T C T T AC A 
AAT CT T AT AGT GG T GT T CG T C G C TAT AT T C C T GT G AAAAAG CT AAC T AC A
AGT AGT GAAAAAG CG AAAGAT GAGGC GAC T AAAC C G AC T AGT T AT C C C AA 
C T TAG C T AAAAC AG GT AC C TAT AC AT T T ACT AAAACT GT AGAT GT G AAG A 

GTCAACCTAAAGTATCAAGTCCAGTGGAATTTAATTTTCAAAAGGGTGAA 
AAAAT AC AT TAT GAT C AAGT GT T AGT AGT AGAT GGT CAT C AGT G GAT T T C 
AT AC AAG AGT TAT T C C GGT AT T CGT CG C T AT AT T GAAAT T 

SEQ ID NO. 8904 

STRAIN H36B

AAAAAAGG AC AAGT AAAT GAT AC T AAGC AAT C T T AC T 
C T C T AC GT AAAT AT AAAT T T GGT T TAG CAT C AG T AAT T T TAG GGT C ATT C 
AT AAT GGT C AC AAGT CCTGTTTTT GC GGAT C AAAC T AC AT C GG T T C AAGT 
T AAT AAT C AG AC AG G C AC T AGT GT GGAT GAT AAT AAT T C T T C C AAT GAGA 
C AAGT G CGT CAAG T GT GAT T ACT T C C AAT AAT G AT AGT GT T C AAG CGT C T 
GAT AAAG T T G T AAAT AGT C AAAAT AC G G C AAC AAAGG AC AT TACT ACT C C 
T T T AGT AGAGAC AAAGC C AAT GGT GG AAAAAAC AT T AC C T GAAC AAG GG A 
AT TAT GT T TAT AG C AAAGAAAC C GAGGT G AAAAAT AC AC CT T C AAAAT CA 
G C C C C AGT AGC T T T CT AT GC AAAG AAAGGT GAT AAAGT T T T CT AT G AC C A 
AGT AT T T AAT AAAG AT AAT G T GAAAT GGAT T T CAT AT AAG TCTTTTTGTG 

GCGTACGTCGATACGCAGCT ATT GAGTCACT AGAT CCATCAGGAGGTTCA 
GAGAC T AAAG C AC C T AC T C C T GT AAC AAAT T CAGG AAGC AAT AAT CAAG A 
G AAAAT AG C AAC GCAAGG AAAT TAT ACATTTT CACATAAAGT AGAAGTAA 
AAAAT GAAG C T AAG G T AGC GAGT C C AAC T C AAT T T AC AT T GG AC AAAG G A 
G AC AGAAT T T T T T ACG AC C AAAT ACT AAC TAT T G AAGG AAAT C AGT G GT T 
AT C T TAT AAAT CAT T C AAT GG T GT T CGT CGTTTTGTTtTG C T AGGT AAAG 
CAT C T T C AGT AG AAAAAACT G AAG AT AAAGAAAAAGT G T C T C C T C AAC C A 
C AAGC C C GT AT T AC T AAAAC T GGT AGAC T GAC TAT T T C T AAC G AAAC AAC 
T AC AGGT T T T GAT AT T T T AAT T AC G AAT AT T AAAG AT GAT AAC G GT AT C G 

CTGCTGTTAAGGTACCGGTTTGGACTGAACAAGGAGGGCAAGATGATATT 
AAAT GGT AT AC AG C T G T AAC T AC T GG G GAT G G C AAC T AC AAAGT AG CT GT 
AT CAT T T G C T GAC CAT AAG AAT GAG AAG GGT C T T TAT AAT AT T CAT T TAT 
AC T AC CAAG AAG CT AGT G G GAC AC T T GT AG G T GT AAC AgG AAC T AAAGT G 
AC AG TAG C T G GAAC T AAT T C T T C T CAAG AAC C T AT T G AAAAT GGT T TAG C 
AAAG AC T GGT GT T TAT AAT AT TAT C GG AAGT AC T G AAGT AAAAAAT GAAG 
CT AAAAT AT C AAGT C AGAC C C AAT T T ACT T T AG AAAAAG G T GAC AAAAT A 
AAT TAT GAT C AAGT AT T GAC AG C AG AT GGT T AC C AGT GGAT T T C T T AC AA 

ATCTTATAGTGGTGTTCGTCGCTATATTCCTGTGAAAAAGCTAACTACAA 
G TAG T GAAAAAG CG AAAG AT GAG G C GAC T AAAC CG AC TAG T TAT C C C AAC 
T T AC C T AAAAC AGGT AC C TAT AC AT T TACT AAAACT GT AG AT GT GAAG AG 
T C AAC C T AAAGT AT C AAGT C C AGT G G AAT T T AAT T T T C AAAAGGGT G AAA 
AAAT AC AT TAT GAT C AAGT GT T AGT AG TAG AT GGT CAT C AG T GGAT T T C A 
T AC AAGAGT T AT T C CG G T AT T CGT C G C TAT AT T GAAAT T 

SEQ ID NO. 8905 

STRAIN 18RS21 

AAAAAAG GAC AAG T AAAT GAT AC T AAG C AAT CT T AC T C 

T C T ACG T AAAT AT AAAT T T GGT T TAG CAT C AG T AAT T T T AGGG T CAT T C A 

TAATGGTCACAAGTCCTGTTTTTGCGGATCAAACTACATCGGTTCAAGTT 
AAT AAT C AG AC AGG C AC TAG T GT G GAT G C T AAT AAT T C T T C C AAT GAGAC 

AAGTGCGTCAAGTGTGATTACTTCCAATAATGATAGTGTTCAAGCGTCTG 
AT AAAGTTGT AAAT AGT C AAAAT ACGGCAACAAAGGACATT ACT ACT CCT 
T T AGT AGAGAC AAAG C C AAT GG T G G AAAAAAC AT TAG CT GAAC AAG G G AA 
T TAT GT T T AT AG C AAAG AAAC C G AGG T G AAAAAT AC AC C T T C AAAAT C AG 

CCCCAGTAGCTTTCTATGCAAAGAAAGGTGATAAAGTTTTCTATGACCAA 
GT AT T T AAT AAAG AT AAT GT GAAAT G GAT T T CAT AT AAG TCTTTTTGTGG 
CGTACGTC GAT AC G C AG C T AT T GAGT C ACT AG AT C CAT C AG GAGGT T C AG 
AG ACT AAAG C AC C T AC T CCT G T AAC AAAT T C AG GAAG C AAT AAT C AAGAG 
AAAAT AG C AAC G CAAG GAAAT TAT AC AT T T T CACATAAAGT AG AAGT AAA 
AAAT GAAG c T AAGGT AG C GAG T C C AAC T C AAT T T AC AT T G GAC AAAG GAG 
AC AG AAT T T T T T AC GAC C AAAT AC T AAC TAT T GAAG GAAAT C AG T GGT T A 

TCTTATAAATCATTCAATGGTGTTCGTCGTTTTGTTTTGCTAGGTAAAGC 
AT C T T C AG TAG AAAAAAC T GAAG AT AAAG AAAAAG TGTCTCCT C AAC C AC 
AAG C C C GT AT T AC T AAAAC T G G T AG AC T G ACT AT T T C T AAC G AAAC AACT 
AC AG GT T T T GAT AT T T T AAT T ACG AAT AT T AAAG AT GAT AAC G GT AT C GC 
T G C T GT T AAGG T AC C GG T T T GG AC T GAAC AAGGAGGG C AAG AT GAT AT T A

AATGGTATACAGCTGTAACTACTGGGGATGGCAACTACAAAGTAGCTGTA 
T CAT T T G C T GAC C AT AAGAAT GAG AAGG GT C T T T AT AAT AT T CAT T TATA 
C TAG C AAG AAGC T AGT G GGAC ACT T GT AGGT GT AAC AG GAACT AAAGT G A 
C AGT AG CT G GAACT AAT T CT T C T C AAGAAC C TAT T GAAAAT G G T T TAG C A 
AAG AC T GG T GT T T AT AAT AT TAT C GG AAGT AC T GAAGT AAAAAAT G AAGC 
T AAAAT AT C AAG T C AGAC C C AAT T T ACT T T AG AAAAAG G T G AC AAAAT AA 
AT TAT GAT C AAGT AT T G AC AGC AGAT GGT T AC C AGT G GAT T T C T T AC AAA 

T CT T ATAGTGGT GTT CGT CGCT AT ATT CCTGTGAAAAAGCTAACT ACAAG 
T AGT GAAAAAG C GAAAGAT GAG G CG ACT AAAC C GAC TAG T TAT C C C AAC T 
T AC C T AAAAC AG G T AC C TAT AC AT T T AC T AAAACT G T AGAT G T GAAAAGT 
C AAC C T AAAG T AT C AAG T C C AG T GGAAT T T AAT T T T C AAAAGGGT G AAAA 
AAT AC AT TAT GAT C AAGT GT T AGT AGT AGAT GGT CAT C AGT G GAT T T CAT 

acaagagtta!ttccggtattcgtcgctatattgaaatt 

SEQ ID NO. 8906 

STRAIN M732 

C AAGT AAAT GAT a C T AAG C AAT C T T AC T CT C T AC GT AAAT AT AAAT T T G G 
T T TAG CAT C AGT AAT T T T AGGGT CAT T C AT AAT GG T C ACAAG TCCTGTTT 
T T G C G GAT C AAAc T AC AT C G G T T C AAG T T AAT AAT C AG AC AG G C AC TAG T 
GT G GAT G C T AAT AAT T C T T C C AAT GAGACAAGT G CGT C AAGT G T G AT T AC 
T T C C AAT AAT G AT AGT GT T C AAG C G T CT GAT AAAGT T GT AAAT AGT C AAA 
AT AC GG C AAC AAAG GAC AT T AC T AC T C C T T TAG T AG AG AC AAAGC C AAT G 

GTGGAAAAAACATTACCTGAACAAGGGAATTATGTTTATAGCAAAGAAAC 
C GAGGT G AAAAAT AC AC C T T C AAAAT C AG C C C C AGT AG C T T T CT AT GC AA 
AG AAAG GT GAT AAAG T T T T CT AT G AC C AAGT AT T T AAT AAAGAT AAT GT G 
AAAT G GAT T T CAT AT AAGT CTTTTGGTGGCG T AC G T CG AT ACGC AG C TAT 
T GAG T C AC TAG AT C C AT C AGGAG GTT C AG AG AC T AAAG C AC CT ACT C C T G 

TAACAAATTCAGGAAGCAATAATCAAGAGAAAATAGCAACGCAAGGAAAT 
TAT AC AT T T T C AC AT AAAGT AGAAGT AAAAAAT GAAG C T AAGG T AGC GAG 
T C C AAC T C AAT T T AC AT T G G AC AAAGG AG AC AGAAT T T T T T AC GAC C AAA 
TAG T AAC T a t T GAAG G AAAT C AGT G G T TAT C T TAT AAAT CAT T C AAT GGT 

GTT CGT CGT TTTGtTttGcT AGGT AAAGC ATCTTC AGT AGAAAAAACTGA 
AG AT AAAG AAAAAGT G T C T C C T C AAC C AC AAG C C C GT AT T AC T AAAAC T G 
G T AGAC T GAC TAT T T C T AAC G AAAC AAC T AC AG GT T T T GAT AT T T T AAT T 
AC G AAT AT T AAAG AT G AT AAC G GT AT CGCTGCTGT T AAG G T AC CG G T T T G 
G AC T GAAC AAG GAGGG C AAG AT GAT AT T AAAT G GT AT AC AG CT GT AAC T A 
C T G G GG AT G G C AAC T AC AAAGT AG C T GT AT CAT T T G C T GAC CAT AAG AAT 

GAGAAGGGT CTT T AT AAT ATTCATTTAT ACTACCAAGAAGCTAGTGGGAC 
AC T T GT AGGT GT AAC AG GAACT AAAGT GAC AGT AG CT GG AAC T AAT T C T T 
CT C AAG AAC C TAT T GAAAAT GGT T T AC C AAAG ACT G GT GT T TAT AAT AT T 

AT CG GAAGT ACT GAAGT AAAAAATGAAGCT AAAAT AT CAAGTC AG ACCC A 
AT T T ACT T TAG AAAAAG G T GAC AAAAT AAAT TAT GAT C AAG TAT T GAC AG 

CAGATGGTTACCAGTGGATTTCTTACAAATCTTATAGTGGTGTTCGTCGC 
TAT AT T C C T G T GAAAAAG C T AAC T AC AAGT AG T GAAAAAG C GAAAGAT G A 
G G C G AC T AAAC C G AC T AGT TAT C C C AACT T AC C T AAAAC AGGT AC C T AT A 
CAT T TACT AAAAC T G T AG AT GT GAAAAGT C AAC CT AAAGT AT C AAGT C C A 
GT G G AAT T T AAT T T T C AAAAGGGT GAAAAAAT AC AT TAT GAT C AAG T GT T 
AG T AGT AG AT GGT CAT C AGT G GAT T T CAT AC AAG AG T TAT T C C G G TAT T C 
GT C G C TAT AT T G AAAT T 

SEQ ID NO. 8907 

STRAIN COH1 

AAAAAAG GAC AAG T AAAT GAT ACT AAG C AAT CTTACTCTCT 

ACGT AAAT AT AAAT T TGGTTT AGC AT C AGT AATTTT AGGGT C ATT C AT AA 
TGGTCACAAGTCCTGTTTTTGCGGATCAAACTACATCGGTTCAAGTTAAT 
AAT C AGAC AG G C AC TAG T G T GG AT G CT AAT AAT T C T T C C AAT GAG AC AAG 

TGCGTCAAGTGTGATTACTTCCAATAATGATAGTGTTCAAGCGTCTGATA 
AAGTTGTAAATAGTCAAAATACGGCAACAAAGGACATTACTACTCCTTTA 
GT AGAG AC AAAG C C AAT GG T G G AAAAAAC AT T AC C T GAAC AAG G G AAT T A 
T G T T TAT AGC AAAG AAAC C GAG G T G AAAAAT AC AC C T T C AAAAT C AG C C C 
C AGT AG C T T T C TAT G C AAAG AAAG GT GAT AAAG T T T T C T AT GAC C AAG T A 

TTTAAT AAAGAT AATGTTAAATGGATTTCATATAAGTCTTTTGGTGGCGT 
AC G T C GAT AC G C AG CT AT T GAG T C AC TAG AT C CAT C AG GAGGT T C AG AG A 
CT AAAG C AC C T AC T C C T GT AAC AAAT T C AG G AAG C AAT AAT C AAG AG AAA
AT AGCAACGCAAGGAAATTAT ACATT T T CACAT AAAGT AGAAGTAaAAAA 
T GAAG c T AAG GT AG CG AGT C C AAC T C AAT T TAG AT TG G AC AAAG GAG AC A 
GAATTTTTTACGACCAAATACTAACTATTGAAGGAAATCAGTGGTTATCT 
TAT AAAT CATTCAATGGTGTTCGTCGTTTTGTTtTGCTAGGTAAAGCATC 
T T C AGT AGAAAAAACT GAAGAT AAAGAAAAAG TGTCTCCT C AAC C AC AAG 
C C CGT AT T AC T AAAACT G GT AG ACT GAC TAT T T CT AAC GAAAC AAC TAG A 
G GT T T T GAT AT T T T AAT T AC GAAT AT T AAAGAT GAT AACG GT AT CG CT GC 
T GT T AAGGT AC C G GT T T G GAC T G AAC AAGG AG GG CAAGAT GAT AT T AAAT 
GGT AT AC AG CT GT AAC T ACT G GGGAT GG C AAC T AC AAAGT AG C T GT AT C A 
T T T G C T GAC CAT AAG AAT GAGAAGG GT CT T T AT AAT AT T CAT T TAT AC T A 
C C AAG AAG C T AGT G GGAC AC T T GT AGGT GT AAC AG GAAC T AAAG T GAC AG 
T AG CT GGAAC T AAT T CT T C T C AAG AAC CT AT T G AAAAT GGT T TAG C AAAG 
AC T GG T GT T T AT AAT AT T AT C GG AAGT ACT G AAGT AAAAAAT GAAG C T AA 
AAT AT C AAG T C AG AC C C AAT T T ACT T TAG AAAAAGGT GAC AAAAT AAAT T 
AT GAT C AAGT AT T GAC AG C AGAT GG T TAG C AG T GG AT T T C T T AC AAAT C T 
T AT AGT G GT GTTCGTCGC TAT AT T C C T GT G AAAAAG CT AAC T AC AAGT AG 
T G AAAAAG C G AAAG AT GAGG C G AC T AAAC C G AC T AGT TAT C C C AAC T T AC 
C T AAAAC AG GT AC CT AT ACAT T T AC T AAAAC T GT AGAT G T G AAAAGT C AA 
CC T AAAGT AT C AAGT C C AGT GG AAT T T AAT T T T C AAAAGGG T GAAAAAAT 

ACATTATGATCAAGTGTTAGTAGTAGATGGTCATCAGTGGATTTCATACA 
AG AGT TAT T C C GGT AT T C GT C G CT AT AT T G AAATT 

SEQ ID NO. 8908 

STRAIN M781 

AAAAAAG GAC AAGT AAAT GAT AC T AAG C AAT C T T 

ACT C T C T AC GT AAAT AT AAAT T T GGT T T AGC AT C AG T AAT T T T AG GGT C A 

TT CAT AATGGTCACAAGTCCTGTTTTTGCGGATCAAACT ACAT CGGTTCA 
AGT T AAT AAT C AGAC AGG C AC T AGT GT G GAT G C T AAT AAT T C T T C C AAT G 
AG AC AAGT G C GT C AAG T GT G AT T AC T T C C AAT AAT G AT AG TGT T C AAG CG 
T CT GAT AAAGT T GT AAAT AG T C AAAAT AC GG C AAC AAAGGAC AT T AC TAG 
T C CT T T AGT AGAG AC AAAG C C AAT G GT G G AAAAAAC AT T AC CT G AAC AAG 
GG AAT TAT GT T TAT AG C AAAG AAAC C G AG GT G AAAAAT AC AC C T T C AAAA 

TCAGCCCCAGTAGCTTTCTATGCAAAGAAAGGTGATAAAGTTTTCTATGA 
C C AAGT AT T T AAT AAAG AT AAT G T G AAAT GG AT T T CAT AT AAGT CT T T T G 
GT GG C GT ACGT C GAT AC G C AG C TAT T G AG T C AC TAG AT C CAT C AG G AG GT 
T C AG AGAC T AAAG C AC CT AC T C C T GT AAC AAAT T C AGG AAG C AAT AAT C A 
AGAG AAAAT AG C AAC G C AAGGAAAT TAT AC AT T T T CACAT AAAGT AG AAG 
T AAAAAAT G AAG CT AAGG TAG C G AGT C C AAC T C AAT T T AC AT T GGAC AAA 
G GAG AC AGAAT T T T T TAG GAC C AAAT AC T AACT AT T GAAGG AAAT C AGT G 
GTTATCTTATAAATCATTCAATGGTGTTCGTCGTTTTGTTtTGCTAGGTA 
AAG CAT C T T C AGT AG AAAAAACT GAAGAT AAAGAAAAAGT GT C T C C T C AA 

CCACAAGCCCGTATTACTAAAACTGGTAGACTGACTATTTCTAACGAAAC 
AAC T AC AG GT T T T GAT AT T T T AAT TAG GAAT AT T AAAGAT GAT AAC G GT A 
TCGCTGCTGT T AAg gT AC CGG T T T G GAC T GAAC AAG G AG GG CAAGAT GAT 
ATT AAAT GGT AT AC AGCTGT AACT ACTGGGGATGGC AACT AC AAAGT AGC 
T GT AT CAT T T G C T GAC CAT AAG AAT GAG AAGG G T C T T TAT AAT AT T CAT T 

TATACTACCAAGAAGCTAGTGGGACACTTGTAGGTGTAACAGGAACTAAA 
GTGACAGTAGCTGGAACTAATTCTTCTCAAGAACCTATTGAAAATGGTTT 
AC C AAAG ACT GGT GTTT AT AAT AT TAT C GG AAGT ACT GAAGT AAAAAAT G 

AAGCT AAAAT AT CAAGT C AGAC C CAATT TACTTTAGAAAAAGGTGACAAA 
AT AAAT TAT GAT C AAGT AT T GAC AG C AG AT GGT T AC C AG T G GAT T T C T T A 

CAAATCTTATAGTGGTGTTCGTCGCTATATTCCTGTGAAAAAGCTAACTA 
CAAGT AGT G AAAAAG C G AAAG AT GAGG C GAC T AAAC C GAC T AG T TAT C C C 

AACTTACCTAAAACAGGTACCTATACATTTACTAAAACTGTAGATGTGAA 
AAGT C AAC CT AAAGT AT CAAGT C C AGTGG AATT T AAT T TT C AAAAGGGT G 
AAAAAAT AC AT TAT GAT C AAG TGT TAG T AGT AG AT GGT CAT C AG T G GAT T 
TCATACAAGAGTTATTCCGGTATTCGTCGCTATATTGAAATT 

SEQ ID NO. 8909 

STRAIN CJB110 

AAAAAAGG AC AAG T AAAT GAT AC T AAG C AAT C T T AC T C T C 

TACGTAAATATAAATTTGGTTTAGCATCAGTAATTTTAGGGTCATTCATA 

ATGGTCACAAGTCCTGTTTTTGCGGATCAAACTACATCGGTTCAAGTTAA 
T AAT C AGAC AG G C ACT AGT GT GG AT GCT AAT AAT T C T T C C AAT G AGAC AA
GT G C GT C AAGT GT G AT T ACT T C C AAT AAT G AT AGT GT T C AAG C G T C T GAT 

AAAGTTGTAAATAGTCAAAATACGGCAACAAAGGACATTACTACTCCTTT 
AGT AG AGAC AAAG C C AAT GGT GGAAAAAAC AT T AC C T GAAC AAG GG AAT T 
AT GT T TAT AG C AAAGAAAC C GAGGT G AAAAAT AC AC C T T C AAAAT CAGC C 
C CAGT AG C T T T C TAT G C AAAGAAAGGT GAT AAAGT T T T C TAT G AC C AAGT 
AT T T AAT AAAGAT AAT GT GAAAT G GAT T T CAT AT AAG TCTTTTTGTGGCG 
TAG G T CG AT AC GC AGC T AT T GAGT CAC T AGAT C CAT C AGGAGGT T C AGAG 
ACTAAAGCACCTACTCCTGTAACAAATTCAGGAAGCAATAATCAAGAGAA 
AATAGCAACGCAAGGAAATTATACATTTTCACATAAAGTAGAAGTAAAAA 
AT G AAG C T AAG GT AG CGAG T C C AAC T C AAT T T AC AT T G GAC AAAGG AGAC 
AGAAT T T T T T AC GAC C AAAT AC T AAC T AT TG AAG GAAAT C AG T G G T TAT C 

TTATAAATCATTCAATGGTGTTCGTCGTTTTGTTTTGCTAGGTAAAGCAT 
C T T C AGT AG AAAAAAC T G AAGAT AAAG AAAAAGT G T C T C CT C AAC C AC AA 
G C C C GT AT T AC T AAAACT GG T AG AC T GAC TAT T T C T AAC G AAAC AAC T AC 

AGGTTTTGATATTTTAATTACGAATATTAAAGATGATAACGGTATCGCTG 
C T GT T AAGG T AC C GGT T T GGAC T GAAC AAG GAG G G C AAGAT GAT AT T AAA 
T GGT AT AC AG C T GT AAC T AC T GG GG AT GG C AAC T AC AAAG TAG CT GT AT C 
AT T T G C T GAC CAT AAG AAT GAG AAG G GT C T T TAT AAT AT T CAT T TAT ACT 
AC C AAG AAG CT AGT G G GAC AC TT GT AG GT G T AAC AGGAAC T AAAGT GAC A 
GT AG CT G GAAC T AAT T C T T C T C AAG AAC C TAT T G AAAAT GGT T T AG C AAA 
G ACT G GT GT T TAT AAT AT TAT C G GAAGT ACT G AAGT AAAAAAT G AAG C T A 
AAAT AT C AAG T C AG AC C C AAT T T ACT T TAGAAAAAGGT GAC AAAAT AAAT 
TAT GAT C AAGT AT T GAC AGC AG AT G GT T AC CAGT G GAT T T C T T AC AAAT C 
T TAT AGT GGT GT T CGT CG CT AT AT T C C T GT G AAAAAG C T AAC T AC AAG T A 
G T G AAAAAG C G AAAG AT GAG G C G AC T AAAC C GAC TAG T TAT C C C AAC T T A 
C C T AAAAC AG GT AC CT AT AC AT T T AC T AAAAC T G T AGAT GT GAAG AGT C A 
AC C T AAAGT AT C AAG T C C AGT G G AAT T T AAT T T T C AAAAGGGT G AAAAAA 
T AC AT TAT GAT C AAGT GT T AGT AGT AGAT GGT CAT C AGT GG AT T T CAT AC 
AAGAGTTATTCCGGTATTCGTCGCTATATTGAAATT 

SEQ ID NO. 8910 

STRAIN 1169NT 

AAAAAAGGACAAGTAAATGATACTAAGCAATCTTACTC 

T CT AC GT AAAT AT AAAT T T GG T T T AGC AT CAGT AAT T T T AGGGT CAT T C A 

TAATGGTCACAAGTCCTGTTTTTGCGGATCAAACTACATCGGTTCAAGTT 

AATAAT C AGACAGGCACT AGT GTGGATGCT AAT AAT T CT T C CAATGAGAC 
AAGT G CGT C AAGT GT G AT T AC T T C C AAT AAT GAT AGT GT T C AAG C G T C T G 
AT AAAGT T GT AAAT AG T C AAAAT AC G GC AAC AAAG GAC AT TACT AC T C C T 
T T AGT AG AG AC AAAGC C AAT G G T G GAAAAAAC AT T AC C T GAAC AAG G G AA 
T TAT G T T TAT AGC AAAG AAAC C GAG G T G AAAAAT AC AC C T T C AAAAT CAG 
C C C CAGT AG C T T T C TAT G C AAAG AAAGGT GAT AAAG T T T T CT AT GAC C AA 
G T AT T T AAT AAAG AT AAT GT GAAAT GG AT T T CAT AT AAGT CTTTTGGTGG 
CGTACGTC GAT AC G C AG CT AT T GAGT CAC TAG AT C CAT CAG G AG GT T CAG 
AG AC T AAAG C AC C T AC T C CT G T AAC AAAT T C AGG AAG C AAT AAT C AAG AG 
AAAAT AG C AAC G C AAG GAAAT TAT AC AT T T T CAC AT AAAG TAG AAGT AAA 
AAAT GAAG C T AAG GT AG CGAG T C C AACT C AAT T T AC AT T GGAC AAAG GAG 
AC AGAAT T T T T T ACG AC C AAAT AC T AACT AT T GAAG GAAAT C AGT G G T T A 

TCTTATAAATCATTCAATGGTGTTCGTCGTTTTGTTTTGCTAGGTAAAGC 
AT C T T C AGT AGAAAAAAC T GAAG AT AAAG AAAAAGT G T C T C C T C AAC CAC 

AAGCC CGT ATT ACT AAAACT GGT AG ACT GACT AT TTCTAACG AAAC AACT 
AC AGGT T T T GAT AT T T T AAT T AC GAAT AT T AAAG AT GAT AAC GGT AT C G C 
T G C T G T T AAG GT AC CGGT T T G GACT GAAC AAG G AGG G C AAGAT GAT AT T A 

AATGGTATACAGCTGTAACTACTGGGGATGGCAACTACAAAGTAGCTGTA 
T CAT T T G C T GAC CAT AAG AAT G AG AAG GG T C T T TAT AAT AT T CAT T TATA 
C T AC C AAG AAG C TAG T G G GAC AC T T GT AGGT G T AAC AG GAAC T AAAG T G A 

CAGTAGCTGGAaCTAATTCTTCTCAAGAACCTATTGAAAATGGTTTAGCA 
AAG AC T G GT GT T TAT AAT AT TAT C GGAAGT AC T GAAGT AAAAAAT GAAG C 
T AAAAT AT C AAG T CAG AC C C AAT T T AC T T TAG AAAAAG G T GAC AAAAT AA 
AT TAT GAT C AAG TAT T GAC AG CAG AT GGT T AC CAG T GG AT T T C T T AC AAA 

TCTTATAGTGGTGTTCGTCGCTATATTCCTGTGAAAAAGCTAACTACAAG 
T AGT G AAAAAG C G AAAG AT G AGG C G AC T AAAC C G AC T AGT TAT C C C AAC T 
T AC C T AAAAC AG GT AC C TAT AC AT T T AC T AAAACT G T AG AT GT G AAAAG T 
C AAC C T AAAG TAT C AAGT C CAGT GG AAT T T AAT T T T CAAAAGG GT G AAAA 
AAT AC AT TAT GAT C AAGT G T T AGT AGT AGAT GGT CAT C AGT G GAT T T CAT
ACAAGAGTTATTCCGGTATTCGTCGCTATATTGAAATT 

SEQ ID NO. 8911 

STRAIN JM9130013 

AAAAAAG GAC AAG T AAAT GAT ACT AAG C AAT C T T AC T 
CT CT ACGTAAATAT AAATTTGGTTTAGCATCAGT AAT TTTAGGGT CAT T C 
AT AAT GGT C AC AAGT C CT GTT T T T GC GG AT C AAACT AC AT CG GT T C AAG T 
T AAT AAT C AGAC AGG C AC TAG T GT GG AT G CT AAT AAT T CTT C C AAT GAGA 

CAAGTGCGTCAAGTGTGATTACTTCCAATAATGATAGTGTTCAAGCGTCT 
GAT AAAGT T G T AAAT AGT C AAAAT AC G G C AAC AAAGG AC AT T AC T AC T C C 
T T TAG TAG AGAC AAAG C C AAT GGT GG AAAAAAC AT T AC C T GAAC AAGGGA 
AT T AT GT T TAT AG C AAAG AAAC C GAG G T GAAAAAT AC AC CTT C AAAAT C A 
G C C C C AG T AG CT T T C TAT G C AAAG AAAG G T GAT AAAGT T T T CT AT GAC C A 
AG TAT T T AAT AAAGAT AAT GT G AAAT GG AT T T CAT AT AAG TCTTTTTGTG 
GC G T ACGT C GAT ACG C AG C TAT T GAG T C AC T AGAT C CAT C AG GAGG T T C A 
GAG ACT AAAG C AC C T AC T C C T G T AAC AAAT T C AG GAAG C AAT AAT C AAG A 
G AAAAT AG CAAC G C AAG GAAAT TAT AC AT T T T C AC AT AAAGT AG AAG T AA 
AAAAT GAAG C T AAGGT AG C GAGT C C AACT C AAT T T AC AT T GGAC AAAG G A 
GAC AG AAT T T T T T AC GAC C AAAT AC T AACT AT T G AAGG AAAT CAGT G GTT 

ATCTTATAAATCATTCAATGGTGTTCGTCGTTTTGTTTTGCTAGGTAAAG 
CAT CTT C AG TAG AAAAAAC T GAAG AT AAAG AAAAAG TGTCTCCT CAAC C A 
C AAG C C C GT AT T AC T AAAACT GGT AG AC T G ACT AT T TAT AAC G AAAC AAC 
T AC AGGT T T T GAT AT T T T AAT T AC GAAT AT T AAAG AT GAT AAC G GT AT CG 

CTGCTGTT AAGGT AC CGGTTTGGACTGAACAAGGAGGGCAAGATGAT ATT 
AAAT G G T AT AC AG CT GT AAC T AC T G G GGAT GG C AAC TAG AAAGT AG C T GT 
AT CAT T T G C T GAC CAT AAG AAT G AGAAG GGT C T T TAT AAT AT T CAT T TAT 
AC T AC C AAGAAG C T AGT G G GAC AC T T G T AG GT GT AAC AGG AAC T AAAGT G 
AC AGT AG C T G GAAC T AAT T C T T CT C AAG AAC C T AT T G AAAAT GGT T TAG C 
AAAG AC T GGT GT T TAT AAT AT TAT C G GAAG T AC T GAAGT AAAAAAT GAAG 
C T AAAAT AT C AAG T C AG AC C C AAT T T AC T T T AGAAAAAG GT G AC AAAAT A 
AAT TAT GAT C AAGT AT T GAC AG C AG AT GGT T AC C AG T GGAT T T CT T AC AA 

ATCTTATAGTGGTGTTCGTCGCTATATTCCTGTGAAAAAGCTAACTACAA 
G T AGT G AAAAAG C G AAAG AT GAG G C GAC T AAAC C G AC T AGT TAT C C CAAC 
T T AC CT AAAAC AGGT AC C TAT AC AT T T AC T AAAAC T GT AG AT GT GAAG AG 
T CAAC C T AAAGT AT C AAG T C C AG T GG AAT T T AAT T T T C AAAAG GGT G AAA 
AAAT ACATTATGAT C AAGT G T TAG T AGT AGAT GGT CAT C AGT G GAT T T C A 
T AC AAG AGT TAT T C C GG T AT T CGT C G C TAT AT T GAAAT T 

SEQ ID NO. 8912 

STRAIN 2603 frame: 1

MKKGQVNDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDANNS 

SNETSASSVITSNNDSVQASDKWNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKE 

TEVKNTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFCGVRRYAAIESLDPSGGS 
ETKAPTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILT

IEGNQWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTISNETTTGF 

DILITNIKDDNGIAAVKVPVWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYN 

IHLYYQEASGTLVGVTGTKVTVAGTNSSQEPIENGLAKTGVYNIIGSTEVKNEAKISSQT 

QFTLEKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPN 

LPKTGTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLVVDGHQWISYKSYSGIRRY 
IEI 

SEQ ID NO. 8913 

STRAIN 090 frame: 1 

KKGQVNDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDANNSS 
NETSASSVITSNNDSVQASDKVVNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKET 
EVKNTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFCGVRRYAAIESLDPSGGSE 
TKAPTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILTI
EGNQWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTISNETTTGFD

ILITNIKDDNGIAAVKVPVWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYNI 

HLYYQEASGTLVGVTGTKVTVAGTNSSQEPIENGLAKTGVYNIIGSTEVKNEAKISSQTQ 

FTLEKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPNL 

PKTGTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLVVDGHQWISYKSYSGIRRYI 
EI 

SEQ ID NO. 8914

STRAIN A909 frame: 1

KKGQVNDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDANNSS 
NETSASSVITSNNDSVQASDKVVNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKET 
EVKNTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFCGVRRYAAIESLDPSGGSE
TKAPTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILTI
EGNQWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTISNETTTGFD
ILITNIKDDNGIAAVKVPVWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYNI
HLYYQEASGTLVGVTGTKVTVAGTNSSQEPIENGLAKTGVYNIIGSTEVKNEAKISSQTQ
FTLEKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPNL
PKTGTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLWDGHQWISYKSYSGIRRYI
EI 

SEQ ID NO. 8915 

STRAIN H36B frame: 1

KKGQWDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDDNNSS 
NETSASSVITSNNDSVQASDKWNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKET 
EVKNTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFCGVRRYAAIESLDPSGGSE 
TKAPTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILTI 
EGNQWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTISNETTTGFD
ILITNIKDDNGIAAVKVPVWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYNI
HLYYQEASGTLVGVTGTKVTVAGTNSSQEPIENGLAKTGVYNIIGSTEVKNEAKISSQTQ
FTLEKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPNL
PKTGTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLWDGHQWISYKSYSGIRRYI 
EI 

SEQ ID NO. 8916 

STRAIN 18RS21 frame: 1 

KKGQVNDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDANNSS 
NETSASSVITSNNDSVQASDKVVNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKET 
EVKNTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFCGVRRYAAIESLDPSGGSE 
TKAPTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILTI 
EGNQWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTISNETTTGFD
ILITNIKDDNGIAAVKVPVWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYNI
HLYYQEASGTLVGVTGTKVTVAGTNSSQEPIENGLAKTGVYNIIGSTEVKNEAKISSQTQ
FTLEKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPNL
PKTGTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLWDGHQWISYKSYSGIRRYI 
EI 

SEQ ID NO. 8917 

STRAIN M732 frame: 1 

QVNDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDANNSSNET 
SASSVITSNNDSVQASDKWNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKETEVK
NTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFGGVRRYAAIESLDPSGGSETKA
PTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILTIEGN
QWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTISNETTTGFDILI
TNIKDDNGIAAVKVPVWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYNIHLY 
YQEASGTLVGVTGTKVTVAGTNSSQEPIENGLPKTGVYNIIGSTEVKNEAKISSQTQFTL 
EKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPNLPKT 
GTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLWDGHQWISYKSYSGIRRYIEI

SEQ ID NO. 8918 

STRAIN COH1 frame: 1 

KKGQVNDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDANNSS 
NETSASSVITSNNDSVQASDKVVNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKET
EVKNTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFGGVRRYAAIESLDPSGGSE 
TKAPTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILTI 
EGNQWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTISNETTTGFD
ILITNIKDDNGIAAVKVPVWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYNI
HLYYQEASGTLVGVTGTKVTVAGTNSSQEPIENGLPKTGVYNIIGSTEVKNEAKISSQTQ
FTLEKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPNL
PKTGTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLVVDGHQWISYKSYSGIRRYI 
EI 

SEQ ID NO. 8919

STRAIN M781 frame: 1

KKGQWDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDANNSS 
NETSASSVITSNNDSVQASDKVVNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKET 
EVKNTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFGGVRRYAAIESLDPSGGSE 
TKAPTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILTI
EGNQWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTISNETTTGFD 
ILITNIKDDNGIAAVKVPWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYNI 
HLYYQEASGTLVGVTGTKVTVAGTNSSQEPIENGLPKTGVYNIIGSTEVKNEAKISSQTQ
FTLEKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPNL
PKTGTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLWDGHQWISYKSYSGIRRYI 
EI 

SEQ ID NO. 8920 

STRAIN CJB110 frame: 1 

KKGQVNDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDANNSS 
NETSASSVITSNNDSVQASDKWNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKET 
EVKNTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFCGVRRYAAIESLDPSGGSE 
TKAPTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILTI 
EGNQWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTISNETTTGFD 
ILITNIKDDNGIAAVKVPVWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYNI
HLYYQEASGTLVGVTGTKVTVAGTNSSQEPIENGLAKTGVYNIIGSTEVKNEAKISSQTQ
FTLEKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPNL
PKTGTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLVVDGHQWISYKSYSGIRRYI
EI 

SEQ ID NO. 8921 

STRAIN 1169NT frame: 1 

KKGQVNDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDANNSS
NETSASSVITSNNDSVQASDKVVNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKET
EVKNTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFGGVRRYAAIESLDPSGGSE 
TKAPTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILTI 
EGNQWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTISNETTTGFD
ILITNIKDDNGIAAVKVPVWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYNI 
HLYYQEASGTLVGVTGTKVTVAGTNSSQEPIENGLAKTGVYNIIGSTEVKNEAKISSQTQ 
FTLEKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPNL
PKTGTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLVVDGHQWISYKSYSGIRRYI 
EI 

SEQ ID NO. 8922 

STRAIN JM9130013 frame: 1 

KKGQVNDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDANNSS
NETSASSVITSNNDSVQASDKVVNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKET
EVKNTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFCGVRRYAAIESLDPSGGSE 
TKAPTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILTI 
EGNQWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTIYNETTTGFD 
ILITNIKDDNGIAAVKVPVWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYNI 
HLYYQEASGTLVGVTGTKVTVAGTNSSQEPIENGLAKTGVYNIIGSTEVKNEAKISSQTQ
FTLEKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPNL
PKTGTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLVVDGHQWISYKSYSGIRRYI
EI 
SAG0001


453 


chromosomal replication initiator protein DnaA 


SAG0002 


378 


DNA polymerase III, beta subunit 


SAG0003 


293 


diacylglycerol kinase catalytic domain protein, putative 


SAG0004 


65 


conserved hypothetical protein 


SAG0005 


67 


hypothetical protein 


SAG0006 


371 


GTP-binding protein YchF 


SAG0007 


191 


peptidyl-tRNA hydrolase 


SAG0008 


1165 


transcription-repair coupling factor 


SAG0009 


31 


hypothetical protein 


SAG0010


90 


S4 domain protein 


SAG0011


123 


cell division protein DivIC, putative 


SAG0012


44 


conserved hypothetical protein 


SAG0013


428 


protein of unknown function 


SAG0014


424 


MesJ/Ycf62 family protein


SAG0015


180 


hypoxanthine-guanine phosphoribosyltransferase


SAG0016


658 


cell division protein FtsH 


SAG0017


447 


pcsB protein 


SAG0018


322 


ribose-phosphate pyrophosphokinase 


SAG0019


391 


aminotransferase, class I 


SAG0020 


253 


recombination protein O 


SAG0021 


283 


protease, putative 


SAG0022 


330 


fatty acid/phospholipid synthesis protein PlsX 


SAG0023 


79 


acyl carrier protein 


SAG0024 


234 


phosphoribosylaminoimidazole-succinocarboxamide synthase 


SAG0025 


1241 


phosphoribosylformylglycinamidine synthase, putative 


SAG0026 


484 


amidophosphoribosyltransferase 


SAG0027 


340 


phosphoribosylformylglycinamidine cyclo-ligase 


SAG0028 


182 


phosphoribosylglycinamide formyltransferase 


SAG0029 


250 


acetyltransferase, GNAT family 


SAG0030 


515 


phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP cyclohydrolase


SAG0031 


299 


peptidase, M23/M37 family 


SAG0032 


434 


group B streptococcal surface immunogenic protein 


SAG0033 


232 


N-acetylmannosamine-6-P epimerase, putative 


SAG0034 


438 


sugar ABC transporter, sugar-binding protein 


SAG0035 


295 


sugar ABC transporter, permease protein 


SAG0036 


276 


sugar ABC transporter, permease protein 


SAG0037 


147 


conserved hypothetical protein 


SAG0038 


220 


conserved hypothetical protein 


SAG0039 


305 


N-acetylneuraminate lyase, putative 


SAG0040 


293 


ROK family protein 


SAG0041 


325 


acetyl xylan esterase, putative 


SAG0042 


267 


phosphosugar-binding transcriptional regulator, RpiR family, putative


SAG0043 


421 


phosphoribosylamine--glycine ligase


SAG0044 


162 


phosphoribosylaminoimidazole carboxylase, catalytic subunit 


SAG0045 


363 


phosphoribosylaminoimidazole carboxylase, ATPase subunit 


SAG0046 


463 


membrane protein, putative 
SAG0047


432 


adenylosuccinate lyase 


SAG0048 


303 


transcriptional regulator, Cro/CI family 


SAG0049 


332 


Holliday junction DNA helicase RuvB 


SAG0050 


145 


phosphotyrosine protein phosphatase, low molecular weight 


SAG0051 


126 


MORN motif family protein 


SAG0052 


592 


membrane protein, putative 


SAG0053 


880 


aldehyde-alcohol dehydrogenase 


SAG0054 


338 


alcohol dehydrogenase, propanol-preferring 


SAG0055 


496 


threonine synthase 


SAG0056 


412 


MATE efflux family protein 


SAG0057 


102 


ribosomal protein S10 


SAG0058 


208 


ribosomal protein L3 


SAG0059 


207 


ribosomal protein L4 


SAG0060 


98 


ribosomal protein L23 


SAG0061 


277 


ribosomal protein L2 


SAG0062 


92 


ribosomal protein S19 


SAG0063 


114 


ribosomal protein L22 


SAG0064 


217 


ribosomal protein S3 


SAG0065 


137 


ribosomal protein L16 


SAG0066 


68 


ribosomal protein L29 


SAG0067 


86 


ribosomal protein S17 


SAG0068 


122 


ribosomal protein L14 


SAG0069 


101 


ribosomal protein L24 


SAG0070 


180 


ribosomal protein L5 


SAG0071 


61 


ribosomal protein S14, putative


SAG0072 


132 


ribosomal protein S8 


SAG0073 


178 


ribosomal protein L6 


SAG0074 


118 


ribosomal protein L18


SAG0075 


164 


ribosomal protein S5 


SAG0076 


59 


ribosomal protein L30 


SAG0077 


146 


ribosomal protein L15


SAG0078 


434 


preprotein translocase, SecY subunit 


SAG0079 


212 


adenylate kinase 


SAG0080 


72 


translation initiation factor IF-1


SAG0081 


38 


ribosomal protein L36 


SAG0082 


121 


ribosomal protein S13 


SAG0083 


118 


ribosomal protein S11


SAG0084 


312 


DNA-directed RNA polymerase, alpha subunit 


SAG0085 


128 


ribosomal protein L17


SAG0086 


85 


lipoprotein, putative 


SAG0087 


59 


hypothetical protein 


SAG0088 


56 


hypothetical protein 


SAG0089 


183 


conserved hypothetical protein 


SAG0090 


139 


conserved hypothetical protein 


SAG0091 


144 


transcriptional regulator ComX1, putative


SAG0092 


230 


phosphoglycerate mutase family protein 


SAG0093 


250 


D-alanyl-D-alanine carboxypeptidase family protein 


SAG0094 


191 


N-acetylmuramoyl-L-alanine amidase, family 4 protein 






WO 2004/018646 



PCT/US2003/026827 



Table 1: Complete list of GBS predicted genes



ORF 


Size 
(a.a.) 


Annotation 


SAG0095 


344 


heat-inducible transcription repressor HrcA 


SAG0096 


190 


heat shock protein GrpE 


SAG0097 


609 


dnaK protein 


SAG0098 


379 


dnaJ protein 


SAG0099 


415 


transcriptional regulator, GntR family 


SAG0100


258 


tRNA pseudouridine synthase A 


SAG0101 


252 


phosphomethylpyrimidine kinase, putative 


SAG0102


154 


conserved hypothetical protein 


SAG0103


189 


conserved hypothetical protein TIGR01440 


SAG0104


280 


conserved hypothetical protein 


SAG0105


427 


trigger factor 


SAG0106


191 


DNA-directed RNA polymerase, delta subunit, putative 


SAG0107


534 


CTP synthase 


SAG0108 


308 


conserved hypothetical protein 


SAG0109


148 


deoxyuridine 5'-triphosphate nucleotidohydrolase


SAG0110


454 


DNA repair protein RadA 


SAG0111 


165 


carbonic anhydrase-related protein 


SAG0112


439 


pyridine nucleotide-disulphide oxidoreductase family protein 


SAG0113 


484 


glutamyl-tRNA synthetase 


SAG0114


322 


ribose ABC transporter, periplasmic D-ribose-binding protein 


SAG0115 


310 


ribose ABC transporter, permease protein 


SAG0116 


492 


ribose ABC transporter, ATP-binding protein 


SAG0117 


132 


ribose ABC transporter protein RbsD 


SAG0118 


303 


ribokinase 


SAG0119 


328 


ribose operon repressor RbsR 


SAG0120


32 


hypothetical protein 


SAG0121 


362 


permease, putative 


SAG0122


228 


ABC transporter, ATP-binding protein 


SAG0123


223 


DNA-binding response regulator 


SAG0124


356 


sensor histidine kinase 


SAG0125


396 


argininosuccinate synthase 


SAG0126


462 


argininosuccinate lyase 


SAG0127


293 


fructose-bisphosphate aldolase 


SAG0128


305 


L-2-hydroxyisocaproate dehydrogenase 


SAG0129


62 


ribosomal protein L28 


SAG0130 


121 


conserved hypothetical protein 


SAG0131 


543 


DAK2 domain protein 


SAG0132 


294 


SPFH domain/Band 7 family protein 


SAG0133 


38 


conserved hypothetical protein 


SAG0134


96 


hypothetical protein 


SAG0135 


246 


amino acid ABC transporter, ATP-binding protein 


SAG0136


516 


amino acid ABC transporter, amino acid-binding protein/permease protein


SAG0137


627 


conserved hypothetical protein 


SAG0138


279 


undecaprenol kinase, putative 


SAG0139


251 


negative regulator of competence MecA, putative 


SAG0140


386 


glycosyl transferase, group 4 family protein 


SAG0141 


256 


ABC transporter, ATP-binding protein 
SAG0142


420 


conserved hypothetical protein 


SAG0143 


410 


selenocysteine lyase 


SAG0144


147 


NifU family protein 


SAG0145


472 


conserved hypothetical protein 


SAG0146


395 


penicillin-binding protein 4, putative 


SAG0147


411 


D-alanyl-D-alanine carboxypeptidase family protein 


SAG0148


551 


oligopeptide ABC transporter, substrate-binding protein, putative 


SAG0149


304 


oligopeptide ABC transporter, permease protein 


SAG0150


343 


oligopeptide ABC transporter, permease protein 


SAG0151 


348 


oligopeptide ABC transporter, ATP-binding protein 


SAG0152


310 


oligopeptide ABC transporter, ATP-binding protein 


SAG0153


283 


4-diphosphocytidyl-2C-methyl-D-erythritol kinase 


SAG0154


147 


adc operon repressor AdcR 


SAG0155 


236 


zinc ABC transporter, ATP-binding protein 


SAG0156 


270 


zinc ABC transporter, permease protein 


SAG0157


NA 


deoxyribonuclease-related protein, degenerate 


SAG0158


419 


tyrosyl-tRNA synthetase 


SAG0159


765 


penicillin-binding protein 1B, putative


SAG0160


1191 


DNA-directed RNA polymerase, beta subunit 


SAG0161 


1216 


DNA-directed RNA polymerase beta' subunit 


SAG0162


121 


conserved hypothetical protein 


SAG0163


323 


competence protein CglA 


SAG0164


282 


competence protein CglB 


SAG0165


151 


conserved hypothetical protein 


SAG0166


123 


conserved domain protein 


SAG0167


324 


conserved hypothetical protein 


SAG0168


397 


acetate kinase 


SAG0169


68 


transcriptional regulator, Cro/CI family 


SAG0170


45 


hypothetical protein 


SAG0171 


151 


hypothetical protein 


SAG0172


221 


protease, putative 


SAG0173


256 


pyrroline-5-carboxylate reductase 


SAG0174


355 


glutamyl-aminopeptidase 


SAG0175


79 


hypothetical protein 


SAG0176


94 


conserved hypothetical protein 


SAG0177


107 


thioredoxin family protein 


SAG0178


208 


tRNA binding domain protein 


SAG0179 


238 


conserved hypothetical protein 


SAG0180


131 


single-strand binding protein 


SAG0181 


214 


hydrolase, haloacid dehalogenase-like family 


SAG0182


581 


sensor histidine kinase, putative 


SAG0183


246 


response regulator 


SAG0184


151 


conserved hypothetical protein 


SAG0185 


242 


membrane protein, putative 


SAG0186


36 


hypothetical protein 


SAG0187


542 


oligopeptide ABC transporter, oligopeptide-binding protein 


SAG0188


325 


oligopeptide ABC transporter, permease protein 


SAG0189


273 


oligopeptide ABC transporter, permease protein 
SAG0190


267 


peptide ABC transporter, ATP-binding protein 


SAG0191 


208 


peptide ABC transporter, ATP-binding protein 


SAG0192


676 


PTS system, IIABC components 


SAG0193


541 


alpha amylase family protein 


SAG0194


639 


transcriptional antiterminator, BglG family 


SAG0195


377 


IS1548, transposase


SAG0196


66 


conserved domain protein 


SAG0197


94 


PTS system, IIB component, putative 


SAG0198


451 


PTS system, IIC component, putative 


SAG0199 


285 


transketolase, N-terminal subunit 


SAG0200 


309 


transketolase, C-terminal subunit 


SAG0201 


419 


oxidoreductase, putative 


SAG0202 


89 


ribosomal protein S15


SAG0203 


709 


polyribonucleotide nucleotidyltransferase 


SAG0204 


250 


conserved hypothetical protein 


SAG0205 


194 


serine O-acetyltransferase 


SAG0206 


60 


lipoprotein, putative 


SAG0207 


447 


cysteinyl-tRNA synthetase 


SAG0208 


128 


conserved hypothetical protein 


SAG0209 


251 


RNA methyltransferase, TrmH family, group 3 


SAG0210 


172 


conserved hypothetical protein 


SAG0211 


286 


DegV family protein 


SAG0212 


32 


hypothetical protein 


SAG0213 


39 


hypothetical protein 


SAG0214 


148 


ribosomal protein L13


SAG0215 


130 


ribosomal protein S9 


SAG0216 


33 


hypothetical protein 


SAG0217 


384 


site-specific recombinase, phage integrase family 


SAG0218 


158 


transcriptional regulator, Cro/CI family 


SAG0219 


101 


hypothetical protein 


SAG0220 


92 


conserved hypothetical protein 


SAG0221 


76 


hypothetical protein 


SAG0222 


108 


conserved domain protein 


SAG0223 


209 


conserved hypothetical protein, fusion 


SAG0224 


332 


replication initiation protein, putative 


SAG0225 


144 


hypothetical protein 


SAG0226 


418 


recombination protein 


SAG0227 


156 


hypothetical protein 


SAG0228 


111 


conserved hypothetical protein 


SAG0229 


95 


conserved hypothetical protein 


SAG0230 


96 


conserved hypothetical protein 


SAG0231 


135 


hypothetical protein 


SAG0232 


186 


hypothetical protein 


SAG0233 


226 


hypothetical protein 


SAG0234 


128 


hypothetical protein 


SAG0235 


93 


hypothetical protein 


SAG0236 


32 


hypothetical protein 


SAG0237 


34 


hypothetical protein 
SAG0238


41 


hypothetical protein 


SAG0239 


286 


transcriptional regulator MutR family 


SAG0240 


393 


transporter, putative 


SAG0241 


213 


amino acid ABC transporter, permease protein 


SAG0242 


308 


amino acid ABC transporter, amino acid-binding protein 


SAG0243 


211 


amino acid ABC transporter, permease protein 


SAG0244 


381 


amino acid ABC transporter, ATP-binding protein 


SAG0245 


152 


protein of unknown function/lipoprotein, putative 


SAG0246 


268 


hypothetical protein 


SAG0247 


116 


hypothetical protein 


SAG0248 


90 


hypothetical protein 


SAG0249 


116 


hypothetical protein 


SAG0250 


193 


membrane protein, putative 


SAG0251 


72 


transcriptional regulator, Cro/CI family 


SAG0252 


186 


acetyltransferase, GNAT family 


SAG0253 


192 


acetyltransferase, GNAT family 


SAG0254 


226 


acetyltransferase, GNAT family 


SAG0255 


315 


conserved hypothetical protein 


SAG0256 


163 


RNA polymerase sigma factor, ECF subfamily 


SAG0257 


53 


lipoprotein, putative 


SAG0258 


202 


transcriptional regulator, TetR family 


SAG0259 


365 


ABC transporter efflux protein, DrrB family, putative 


SAG0260 


238 


ABC transporter, ATP-binding protein 


SAG0261 


129 


IS1381, transposase OrfB


SAG0262 


127 


IS1381, transposase OrfA


SAG0263 


171 


hypothetical protein 


SAG0264 


103 


conserved hypothetical protein 


SAG0265 


235 


conserved hypothetical protein 


SAG0266 


382 


N-acetylglucosamine-6-phosphate deacetylase 


SAG0267 


180 


conserved hypothetical protein 


SAG0268 


304 


glycyl-tRNA synthetase, alpha subunit 


SAG0269 


213 


acyl carrier protein phosphodiesterase, putative 


SAG0270 


679 


glycyl-tRNA synthetase, beta subunit 


SAG0271 


85 


conserved hypothetical protein 


SAG0272 


87 


membrane protein, putative 


SAG0273 


502 


glycerol kinase 


SAG0274 


609 


alpha-glycerophosphate oxidase 


SAG0275 


232 


glycerol uptake facilitator protein 


SAG0276 


445 


NADH oxidase, putative 


SAG0277 


476 


conserved hypothetical protein 


SAG0278 


661 


transketolase 


SAG0279 


101 


conserved hypothetical protein 


SAG0280 


244 


ABC transporter, ATP-binding protein 


SAG0281 


534 


membrane protein, putative 


SAG0282 


461 


PTS system, IIBC components 


SAG0283 


267 


glutamate 5-kinase 


SAG0284 


417 


gamma-glutamyl phosphate reductase 


SAG0285 


298 


conserved hypothetical protein TIGR00006 
SAG0286


108


cell division protein FtsL, putative 


SAG0287 


752 


penicillin-binding protein 2X 


SAG0288 


336 


phospho-N-acetylmuramoyl-pentapeptide-transferase 


SAG0289 


447 


ATP-dependent RNA helicase, DEAD/DEAH box family 


SAG0290 


270 


ABC transporter, substrate-binding protein 


SAG0291 


267 


amino acid ABC transporter, permease protein 


SAG0292 


247 


amino acid ABC transporter, ATP-binding protein 


SAG0293 


74 


conserved hypothetical protein 


SAG0294 


304 


thioredoxin reductase 


SAG0295 


486 


conserved hypothetical protein 


SAG0296 


273 


NAD synthetase 


SAG0297 


444 


aminopeptidase C 


SAG0298 


750 


penicillin-binding protein 1A 


SAG0299 


199 


recombination protein U 


SAG0300 


172 


conserved hypothetical protein 


SAG0301 


40 


hypothetical protein 


SAG0302 


110 


conserved hypothetical protein 


SAG0303 


384 


conserved hypothetical protein 


SAG0304 


487 


conserved hypothetical protein 


SAG0305 


160 


autoinducer-2 production protein LuxS 


SAG0306 


535 


KH domain protein 


SAG0307 


33 


hypothetical protein 


SAG0308 


298 


ABC transporter, ATP-binding protein 


SAG0309 


246 


ABC transporter, permease protein, putative 


SAG0310 


361 


conserved hypothetical protein 


SAG0311 


NA 


DNA-binding response regulator, authentic point mutation 


SAG0312 


234 


conserved hypothetical protein 


SAG0313 


209 


guanylate kinase 


SAG0314 


104 


DNA-directed RNA polymerase, omega subunit, putative 


SAG0315 


796 


primosomal protein N' 


SAG0316 


311 


methionyl-tRNA formyltransferase 


SAG0317 


440 


sun protein 


SAG0318 


245 


serine/threonine phosphatase, putative 


SAG0319 


651 


serine/threonine protein kinase 


SAG0320 


231 


conserved hypothetical protein 


SAG0321 


339 


sensor histidine kinase, putative 


SAG0322 


213 


DNA-binding response regulator 


SAG0323 


466 


hydrolase, haloacid dehalogenase family/peptidyl-prolyl cis-trans isomerase, cyclophilin type


SAG0324 


124 


general stress protein, putative 


SAG0325 


258 


pyruvate formate-lyase-activating enzyme 


SAG0326 


251 


transcriptional regulator, DeoR family 


SAG0327 


327 


transcriptional regulator, putative 


SAG0328 


107 


PTS system, cellobiose-specific IIA component 


SAG0329 


106 


PTS system, cellobiose-specific IIB component 


SAG0330 


433 


PTS system, cellobiose-specific IIC component 


SAG0331 


818 


formate acetyltransferase 


SAG0332 


222 


transaldolase family protein 
SAG0333


362 


glycerol dehydrogenase 


SAG0334 


308 


cysteine synthase A 


SAG0335 


214 


conserved hypothetical protein TIGR00257 


SAG0336 


429 


helicase, putative 


SAG0337 


221 


competence protein F, putative 


SAG0338 


184 


ribosomal subunit interface protein 


SAG0339 


450 


aspartate kinase family protein 


SAG0340 


216 


hydrolase, haloacid dehalogenase-like family 


SAG0341 


49 


hypothetical protein 


SAG0342 


263 


enoyl-CoA hydratase/isomerase family protein 


SAG0343 


144 


transcriptional regulator, MarR family 


SAG0344 


323 


3-oxoacyl-(acyl-carrier-protein) synthase III 


SAG0345 


74 


acyl carrier protein 


SAG0346 


319 


enoyl-(acyl-carrier-protein) reductase II 


SAG0347 


308 


malonyl CoA-acyl carrier protein transacylase


SAG0348 


244 


3-oxoacyl-[acyl-carrier protein] reductase


SAG0349 


410 


3-oxoacyl-(acyl-carrier-protein) synthase II 


SAG0350


166 


acetyl-CoA carboxylase, biotin carboxyl carrier protein 


SAG0351 


140 


(3R)-hydroxymyristoyl-(acyl-carrier-protein) dehydratase 


SAG0352 


456 


acetyl-CoA carboxylase, biotin carboxylase 


SAG0353 


291 


acetyl-CoA carboxylase, carboxyl transferase, beta subunit 


SAG0354 


257 


acetyl-CoA carboxylase, carboxyl transferase, alpha subunit 


SAG0355 


210 


conserved hypothetical protein 


SAG0356 


425 


seryl-tRNA synthetase 


SAG0357 


330 


membrane protein, putative 


SAG0358 


120 


conserved hypothetical protein 


SAG0359 


303 


PTS system, mannose-specific IID component 


SAG0360 


270 


PTS system, mannose-specific IIC component 


SAG0361 


336 


PTS system, mannose-specific IIAB components 


SAG0362 


270 


hydrolase, haloacid dehalogenase-like family 


SAG0363 


194 


hypothetical protein 


SAG0364 


203 


membrane protein, putative 


SAG0365 


473 


xanthine/uracil permease family protein 


SAG0366


169 


conserved hypothetical protein TIGR00150 


SAG0367 


186 


acetyltransferase, GNAT family 


SAG0368 


435 


protein of unknown function 


SAG0369


98 


conserved hypothetical protein 


SAG0370 


139 


HIT family protein 


SAG0371 


167 


hypothetical protein 


SAG0372 


85 


hypothetical protein 


SAG0373 


241 


ABC transporter, ATP-binding protein


SAG0374 


344 


ABC transporter, permease protein 


SAG0375 


266 


conserved hypothetical protein 


SAG0376 


211 


conserved hypothetical protein TIGR00091 


SAG0377 


127 


conserved hypothetical protein 


SAG0378 


379 


N utilization substance protein A 


SAG0379 


98 


conserved hypothetical protein 


SAG0380 


100 


ribosomal protein L7A family 
SAG0381


927 


translation initiation factor IF-2 


SAG0382 


122 


ribosome-binding factor A 


SAG0383 


334 


protein of unknown function/lipoprotein, putative 


SAG0384 


138 


transcriptional repressor CopY 


SAG0385 


744 


copper-transporter ATPase CopA 


SAG0386 


68 


copper-transporter protein CopZ 


SAG0387 


204 


membrane protein, putative 


SAG0388 


270 


hydrolase, haloacid dehalogenase-like family 


SAG0389 


880 


DNA polymerase I 


SAG0390 


146 


CoA-binding domain protein 


SAG0391 


159 


transcriptional regulator, Fur family 


SAG0392 


521 


cell wall surface anchor family protein 


SAG0393 


228 


DNA-binding response regulator 


SAG0394 


345 


sensor histidine kinase 


SAG0395 


246 


membrane protein, putative 


SAG0396 


380 


queuine tRNA-ribosyltransferase 


SAG0397 


102 


conserved hypothetical protein


SAG0398 


179 


BioY family protein 


SAG0399 


258 


AtsA/ElaC family protein 


SAG0400 


168 


cytidine/deoxycytidylate deaminase family protein 


SAG0401 


44 


hypothetical protein 


SAG0402 


449 


glucose-6-phosphate isomerase 


SAG0403 


175 


5-formyltetrahydrofolate cyclo-ligase family protein 


SAG0404 


225 


rhomboid family protein 


SAG0405 


347 


protein of unknown function/lipoprotein, putative 


SAG0406 


299 


UTP-glucose-1-phosphate uridylyltransferase


SAG0407 


338 


glycerol-3-phosphate dehydrogenase (NAD(P)+)


SAG0408 


109 


ribonuclease P protein component 


SAG0409 


271 


SpoIIIJ family protein 


SAG0410 


273 


R3H domain protein 


SAG0411 


177 


conserved hypothetical protein 


SAG0412 


258 


recX protein 


SAG0413 


451 


RNA methyltransferase, TrmA family 


SAG0414 


153 


conserved hypothetical protein 


SAG0415 


142 


acetyltransferase, GNAT family 


SAG0416 


1233 


protease, putative 


SAG0417 


302 


glycosyl transferase, group 2 family protein 


SAG0418 


336 


ribonucleoside-diphosphate reductase 2, beta subunit 


SAG0419 


137 


nrdI protein


SAG0420 


721 


ribonucleoside-diphosphate reductase 2, alpha subunit 


SAG0421 


1055 


cell wall surface anchor family protein 


SAG0422 


129 


conserved hypothetical protein 


SAG0423 


132 


conserved domain protein 


SAG0424 


94 


hypothetical protein 


SAG0425 


105 


carboxymuconolactone decarboxylase family protein 


SAG0426 


131 


conserved hypothetical protein 


SAG0427 


129 


transcriptional regulator, MerR family 


SAG0428 


345 


alcohol dehydrogenase, zinc-containing 
SAG0429


284 


oxidoreductase, aldo/keto reductase family 


SAG0430 


287 


cation efflux system protein 


SAG0431 


174 


transcriptional regulator, TetR family 


SAG0432 


397 


transcriptional regulator, AraC family 


SAG0433 


1389 


surface protein Rib 


SAG0434 


61 


transposase, IS256 family, truncation 


SAG0435 


97 


DNA-damage-inducible protein J, putative 


SAG0436 


62 


hypothetical protein 


SAG0437 


123 


lipoprotein, putative 


SAG0438 


145 


bacteriophage L54a, integrase, truncation 


SAG0439 


NA 


conserved hypothetical protein, degenerate 


SAG0440 


84 


conserved hypothetical protein 


SAG0441 


103 


conserved domain protein 


SAG0442 


189 


acetyltransferase, GNAT family 


SAG0443 


194 


acetyltransferase, GNAT family 


SAG0444 


188 


conserved hypothetical protein 


SAG0445 


883 


valyl-tRNA synthetase 


SAG0446 


319 


oxidoreductase, Gfo/Idh/MocA family 


SAG0447 


287 


magnesium transporter, CorA family 


SAG0448 


391 


transposase, IS256 family 


SAG0449 


354 


conserved hypothetical protein 


SAG0450 


330 


aspartate--ammonia ligase


SAG0451 


149 


bacteriocin transport accessory protein, putative 


SAG0452 


179 


type II DNA modification methyltransferase, putative 


SAG0453 


96 


hypothetical protein 


SAG0454 


161 


phosphopantetheine adenylyltransferase 


SAG0455 


357 


conserved hypothetical protein 


SAG0456 


NA 


conserved hypothetical protein, degenerate 


SAG0457 


192 


conserved hypothetical protein 


SAG0458 


368 


conserved hypothetical protein TIGR00048 


SAG0459 


171 


VanZF domain protein 


SAG0460 


581 


ABC transporter, ATP-binding/permease protein 


SAG0461 


579 


ABC transporter, ATP-binding/permease protein 


SAG0462 


188 


anthranilate synthase component II 


SAG0463 


179 


BioY family protein 


SAG0464 


330 


biotin synthetase 


SAG0465 


164 


hypothetical protein 


SAG0466 


371 


thiolase 


SAG0467 


409 


AMP-binding enzyme domain protein 


SAG0468 


210 


endonuclease III 


SAG0469 


131 


type IV prepilin peptidase-related protein 


SAG0470 


69 


conserved hypothetical protein 


SAG0471 


322 


glucokinase 


SAG0472 


126 


rhodanese-like family protein 


SAG0473 


613 


elongation factor Tu family protein 


SAG0474 


81 


conserved hypothetical protein 


SAG0475 


451 


UDP-N-acetylmuramoylalanine--D-glutamate ligase


SAG0476 


358 


UDP-N-acetylglucosamine--N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase


SAG0477 


378 


cell division protein DivIB, putative 


SAG0478 


429 


cell division protein FtsA 


SAG0479 


426 


cell division protein FtsZ 


SAG0480 


224 


ylmE protein, putative 


SAG0481 


201 


ylmF protein 


SAG0482 


84 


YGGT family protein 


SAG0483 


262 


ylmH protein 


SAG0484 


256 


cell division protein DivIVA, putative 


SAG0485 


930 


isoleucyl-tRNA synthetase 


SAG0486 


100 


conserved hypothetical protein 


SAG0487 


151 


MutT/nudix family protein 


SAG0488 


753 


ATP-dependent Clp protease, ATP-binding subunit 


SAG0489 


34 


hypothetical protein 


SAG0490 


76 


conserved hypothetical protein 


SAG0491 


230 


amino acid ABC transporter, permease protein 


SAG0492 


244 


amino acid ABC transporter, ATP-binding protein 


SAG0493 


564 


phosphoglucomutase/phosphomannomutase family protein 


SAG0494 


284 


methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase


SAG0495 


278 


protein of unknown function 


SAG0496 


446 


exodeoxyribonuclease VII, large subunit 


SAG0497 


71 


exodeoxyribonuclease VII, small subunit 


SAG0498 


290 


geranyltranstransferase, putative 


SAG0499 


275 


hemolysin A 


SAG0500 


157 


arginine repressor ArgR, putative 


SAG0501 


552 


DNA repair protein RecN 


SAG0502 


278 


DegV family protein 


SAG0503 


279 


lipase/acylhydrolase 


SAG0504 


200 


conserved hypothetical protein 


SAG0505 


91 


DNA-binding protein HU 


SAG0506 


65 


hypothetical protein 


SAG0507 


310 


dihydroorotate dehydrogenase A 


SAG0508 


411 


beta-lactam resistance factor 


SAG0509 


403 


beta-lactam resistance factor 


SAG0510 


406 


murM protein, putative 


SAG0511 


270 


hydrolase, haloacid dehalogenase-like family 


SAG0512 


438 


HD domain protein 


SAG0513 


128 


conserved hypothetical protein 


SAG0514 


894 


cation-transporting ATPase, E1-E2 family 


SAG0515 


286 


conserved hypothetical protein 


SAG0516 


643 


fructose-1,6-bisphosphatase, putative


SAG0517 


374 


iron-sulfur cluster-binding protein, putative 


SAG0518 


NA 


peptide chain release factor 2, programmed frameshift 


SAG0519 


230 


cell division ABC transporter, ATP-binding protein FtsE 


SAG0520 


309 


cell division ABC transporter, permease protein FtsX 


SAG0521 


236 


carboxymethylenebutenolidase-related protein 


SAG0522 


232 


metallo-beta-lactamase superfamily protein 
SAG0523


254 


oxidoreductase, short chain dehydrogenase/reductase family 


SAG0524 


835 


DNA polymerase III, epsilon subunit/ATP-dependent helicase DinG


SAG0525 


397 


aspartate aminotransferase 


SAG0526 


448 


asparaginyl-tRNA synthetase 


SAG0527 


185 


conserved hypothetical protein 


SAG0528 


327 


inosine-uridine preferring nucleoside hydrolase 


SAG0529 


38 


hypothetical protein 


SAG0530 


137 


OsmC/Ohr family protein 


SAG0531 


296 


conserved hypothetical protein 


SAG0532 


324 


conserved hypothetical protein 


SAG0533 


303 


conserved hypothetical protein 


SAG0534 


465 


dipeptidase 


SAG0535 


506 


zinc ABC transporter, zinc-binding adhesion lipoprotein


SAG0536 


86 


ribosomal protein L31


SAG0537 


311 


DHH family protein 


SAG0538 


340 


adenosine deaminase, putative 


SAG0539 


147 


flavodoxin 


SAG0540 


91 


chorismate mutase, putative 


SAG0541 


398 


voltage-gated chloride channel family protein 


SAG0542 


127 


IS1381, transposase OrfA


SAG0543 


129 


IS1381, transposase OrfB


SAG0544 


115 


ribosomal protein L19


SAG0545 


359 


prophage LambdaSa1, site-specific recombinase, phage integrase family


SAG0546 


67 


conserved domain protein 


SAG0547 


185 


hypothetical protein 


SAG0548 


265 


prophage LambdaSa1, repressor protein, putative


SAG0549 


47 


hypothetical protein 


SAG0550 


74 


conserved hypothetical protein 


SAG0551 


52 


conserved hypothetical protein 


SAG0552 


62 


hypothetical protein 


SAG0553 


268 


hypothetical protein 


SAG0554 


63 


prophage LambdaSa1, transcriptional regulator, Cro/CI family


SAG0555 


249 


prophage LambdaSa1, antirepressor, putative


SAG0556 


47 


hypothetical protein 


SAG0557 


76 


hypothetical protein 


SAG0558 


74 


hypothetical protein 


SAG0559 


286 


conserved hypothetical protein 


SAG0560 


77 


conserved hypothetical protein 


SAG0561 


46 


hypothetical protein 


SAG0562 


84 


hypothetical protein 


SAG0563 


53 


hypothetical protein 


SAG0564 


160 


conserved hypothetical protein 


SAG0565 


224 


conserved domain protein 


SAG0566


138 


prophage LambdaSa1, single-strand binding protein


SAG0567 


439 


prophage LambdaSa1, reverse transcriptase/maturase family
protein 
SAG0568


67 


conserved hypothetical protein 


SAG0569 


158 


conserved hypothetical protein 


SAG0570 


115 


hypothetical protein 


SAG0571 


43 


hypothetical protein 


SAG0572 


138 


conserved hypothetical protein 


SAG0573 


54 


hypothetical protein 


SAG0574 


89 


conserved hypothetical protein 


SAG0575 


110 


hypothetical protein 


SAG0576 


43 


hypothetical protein 


SAG0577 


177 


conserved hypothetical protein 


SAG0578 


88 


conserved hypothetical protein 


SAG0579 


142 


conserved hypothetical protein 


SAG0580 


111 


conserved hypothetical protein, truncation 


SAG0581 


118 


conserved hypothetical protein 


SAG0582 


422 


conserved hypothetical protein 


SAG0583 


406 


conserved hypothetical protein 


SAG0584 


62 


conserved hypothetical protein, truncation 


SAG0585 


471 


conserved hypothetical protein 


SAG0586 


154 


conserved hypothetical protein 


SAG0587 


300 


prophage LambdaSa1, structural protein, putative


SAG0588 


71 


conserved hypothetical protein 


SAG0589 


143 


conserved hypothetical protein 


SAG0590 


112 


conserved hypothetical protein 


SAG0591 


78 


conserved hypothetical protein 


SAG0592 


111 


conserved hypothetical protein 


SAG0593 


185 


prophage LambdaSa1, structural protein


SAG0594 


81 


conserved hypothetical protein 


SAG0595 


123 


conserved hypothetical protein 


SAG0596 


670 


prophage LambdaSa1, pblA protein, internal deletion


SAG0597 


506 


prophage LambdaSa1, minor structural protein, putative


SAG0598 


1374 


prophage LambdaSa1, N-acetylmuramoyl-L-alanine amidase,
family 4 


SAG0599 


668 


prophage LambdaSa1, minor structural protein, putative


SAG0600 


109 


hypothetical protein 


SAG0601 


70 


hypothetical protein 


SAG0602 


100 


conserved hypothetical protein 


SAG0603 


111 


conserved hypothetical protein 


SAG0604 


239 


prophage LambdaSa1, lysin, putative


SAG0605 


323 


conserved hypothetical protein 


SAG0606 


66 


conserved hypothetical protein 


SAG0607 


56 


conserved hypothetical protein 


SAG0608 


59 


hypothetical protein 


SAG0609 


NA 


prophage LambdaSa1, integrase, degenerate


SAG0610 


134 


conserved hypothetical protein 


SAG0611 


NA 


transposase, degenerate 


SAG0612 


53 


conserved hypothetical protein 


SAG0613 


425 


transmembrane protein Vexp1


SAG0614 


218 


ABC transporter, ATP-binding protein Vexp2 
SAG0615


458 


transmembrane protein Vexp3 


SAG0616 


217 


DNA-binding response regulator VncR 


SAG0617 


439 


sensor histidine kinase VncS 


SAG0618 


195 


transposase OrfB, IS3 family, truncation 


SAG0619 


66 


conserved hypothetical protein 


SAG0620 


62 


hypothetical protein 


SAG0621 


401 


rod shape-determining protein RodA, putative


SAG0622 


186 


hydrolase, haloacid dehalogenase-like family


SAG0623 


650 


DNA gyrase, B subunit 


SAG0624 


574 


septation ring formation regulator EzrA, putative 


SAG0625 


213 


phosphoserine phosphatase SerB 


SAG0626 


161 


MutT/nudix family protein 


SAG0627 


151 


conserved hypothetical protein 


SAG0628 


435 


enolase 


SAG0629 


354 


conserved domain protein 


SAG0630 


427 


3-phosphoshikimate 1-carboxyvinyltransferase


SAG0631 


170 


shikimate kinase 


SAG0632 


457 


psr protein 


SAG0633 


451 


RNA methyltransferase, TrmA family


SAG0634 


70 


hypothetical protein 


SAG0635 


245 


acid phosphatase, class B 


SAG0636 


172 


conserved hypothetical protein 


SAG0637 


NA 


transcriptional regulator, TetR family, putative, authentic 
frameshift 


SAG0638 


109 


cell wall surface anchor family protein, truncation 


SAG0639 


273 


transposase OrfB, IS3 family 


SAG0640 


91 


transposase OrfA, IS3 family


SAG0641 


NA 


Tn5252, Orf 10 protein, degenerate 


SAG0642 


59 


hypothetical protein 


SAG0643 


NA 


chaperonin, 33 kDa, degenerate 


SAG0644 


402 


transcriptional regulator, AraC family 


SAG0645 


554 


cell wall surface anchor family protein 


SAG0646 


307 


cell wall surface anchor family protein 


SAG0647 


305 


sortase family protein 


SAG0648 


260 


sortase family protein 


SAG0649 


890 


cell wall surface anchor family protein, putative 


SAG0650 


189 


sortase family protein 


SAG0651 


201


protein of unknown function 


SAG0652 


NA 


Tn5252, Orf 28 protein, degenerate 


SAG0653 


NA 


conserved hypothetical protein, degenerate 


SAG0654 


34 


hypothetical protein 


SAG0655 


57 


conserved hypothetical protein 


SAG0656 


36 


hypothetical protein 


SAG0657 


89 


hypothetical protein 


SAG0658 


383 


lipoprotein, putative 


SAG0659 


330 


ABC transporter, ATP-binding protein 


SAG0660 


272 


membrane protein 


SAG0661 


261 


conserved hypothetical protein 
SAG0662


101 


cylX protein 


SAG0663 


282 


cylD protein 


SAG0664 


240 


cylG protein 


SAG0665 


101 


acyl carrier protein AcpC 


SAG0666 


158 


cylZ protein 


SAG0667 


309 


cylA protein


SAG0668 


292 


cylB protein 


SAG0669 


667 


cylE protein 


SAG0670 


317 


cylF protein 


SAG0671 


731 


cylI protein


SAG0672 


403 


cylJ protein 


SAG0673 


191 


cylK protein 


SAG0674 


113 


hypothetical protein 


SAG0675 


171 


putative secreted protein 


SAG0676 


885 


proteinase, putative 


SAG0677 


1062 


hypothetical protein 


SAG0678 


NA 


endopeptidase O, degenerate 


SAG0679 


343 


protein of unknown function 


SAG0680 


339 


protein of unknown function 


SAG0681 


353 


conserved domain protein 


SAG0682 


409 


permease, putative 


SAG0683 


NA 


transmembrane protein Vexp3, putative, degenerate 


SAG0684 


223 


ABC transporter, ATP-binding protein 


SAG0685 


472 


conserved hypothetical protein 


SAG0686 


261 


DNA-entry nuclease, putative 


SAG0687 


212 


DedA family protein, putative 


SAG0688 


218 


ABC transporter, ATP-binding protein 


SAG0689 


257 


membrane protein, putative 


SAG0690 


272 


conserved hypothetical protein 


SAG0691 


294 


transcriptional regulator, LysR family 


SAG0692 


193 


regulatory protein, putative 


SAG0693 


377 


IS1548, transposase 


SAG0694 


173 


regulatory protein, putative, truncation 


SAG0695 


330 


D-lactate dehydrogenase 


SAG0696 


516 


sodium:galactoside symporter family protein, putative


SAG0697 


341 


2-keto-3-deoxygluconate kinase


SAG0698 


599 


beta-glucuronidase 


SAG0699 


223 


transcriptional regulator, GntR family 


SAG0700 


205 


2-dehydro-3-deoxyphosphogluconate aldolase/4-hydroxy-2-oxoglutarate aldolase


SAG0701 


466 


glucuronate isomerase 


SAG0702 


348 


mannonate dehydratase 


SAG0703 


279 


D-mannonate oxidoreductase 


SAG0704 


270 


hydrolase, haloacid dehalogenase-like family 


SAG0705 


596 


glycosyl hydrolase, family 3 


SAG0706 


361 


proline dipeptidase 


SAG0707 


334 


transcriptional regulator, RegM family 


SAG0708 


488 


alpha amylase family protein 
SAG0709


332 


glycosyl transferase, group 1 family protein 


SAG0710 


444 


glycosyl transferase, group 1 family protein 


SAG0711 


647 


threonyl-tRNA synthetase 


SAG0712 


234 


DNA-binding response regulator 


SAG0713 


339 


conserved hypothetical protein 


SAG0714 


188 


conserved hypothetical protein 


SAG0715 


216 


amino acid ABC transporter, permease protein 


SAG0716 


231 


amino acid ABC transporter, permease protein 


SAG0717 


266 


amino acid ABC transporter, amino acid-binding protein 


SAG0718 


251 


amino acid ABC transporter, ATP-binding protein


SAG0719 


236 


DNA-binding response regulator 


SAG0720 


449 


sensory box histidine kinase 


SAG0721 


269 


metallo-beta-lactamase superfamily protein 


SAG0722 


122 


conserved hypothetical protein 


SAG0723 


236 


ribonuclease III 


SAG0724 


1179 


chromosome segregation SMC protein 


SAG0725 


265 


hydrolase, haloacid dehalogenase-like family 


SAG0726 


274 


hydrolase, haloacid dehalogenase-like family 


SAG0727 


536 


signal recognition particle-docking protein FtsY 


SAG0728 


270 


ABC transporter, substrate-binding protein 


SAG0729 


300 


ABC transporter, permease protein, putative 


SAG0730 


42 


ABC transporter, ATP-binding protein 


SAG0731 


347 


bacterial luciferase family protein 


SAG0732 


720 


transcriptional accessory protein Tex, putative 


SAG0733 


142 


conserved hypothetical protein 


SAG0734 


87 


phage shock protein C, putative 


SAG0735 


44 


hypothetical protein 


SAG0736 


311 


HPr(Ser) kinase/phosphatase 


SAG0737 


257 


prolipoprotein diacylglyceryl transferase 


SAG0738 


132 


conserved hypothetical protein 


SAG0739 


143 


conserved hypothetical protein 


SAG0740 


91 


conserved hypothetical protein 


SAG0741 


303 


peptidase, U32 family, putative 


SAG0742 


428 


peptidase, U32 family 


SAG0743 


70 


conserved hypothetical protein 


SAG0744 


265 


membrane protein, putative 


SAG0745 


446 


Mn2+/Fe2+ transporter, NRAMP family 


SAG0746 


369 


riboflavin biosynthesis protein RibD 


SAG0747 


208 


riboflavin synthase, alpha subunit 


SAG0748 


397 


riboflavin biosynthesis protein RibA 


SAG0749 


156 


riboflavin synthase, beta subunit 


SAG0750 


496 


lysyl-tRNA synthetase 


SAG0751 


300 


hydrolase, haloacid dehalogenase-like family 


SAG0752 


213 


phosphoglycerate mutase family protein 


SAG0753 


157 


ebsC family protein, putative 


SAG0754 


205 


conserved domain protein 


SAG0755 


282 


peptidase, U32 family 


SAG0756 


174 


conserved hypothetical protein 



WO 2004/018646 



PCT/US2003/026827 



Table 1: Complete list of GBS predicted genes



ORF 


Size (a.a.)


Annotation 


SAG0757 


129 


protein of unknown function/lipoprotein, putative


SAG0758 


599 


oligoendopeptidase F, putative


SAG0759 


931 


phosphoenolpyruvate carboxylase


SAG0760 


377 


IS1548, transposase


SAG0761 


422 


cell division protein, FtsW/RodA/SpoVE family


SAG0762 


398 


translation elongation factor Tu


SAG0763 


252 


triosephosphate isomerase


SAG0764 


230 


phosphoglycerate mutase family protein


SAG0765 


681 


penicillin-binding protein


SAG0766 


198 


recombination protein RecR


SAG0767 


348 


D-alanine--D-alanine ligase


SAG0768 


455 


UDP-N-acetylmuramoylalanyl-D-glutamyl-2,6-diaminopimelate--
D-alanyl-D-alanyl ligase


SAG0769 


406 


oxalate:formate antiporter


SAG0770 


228 


membrane protein, putative 


SAG0771 


512 


cell wall surface anchor family protein


SAG0772 


514 


peptide chain release factor 3 


SAG0773 


126 


conserved hypothetical protein


SAG0774 


244 


ABC transporter, ATP-binding protein


SAG0775 


220 


ABC transporter, permease protein 


SAG0776 


276 


YaeC family protein, putative 


SAG0777 


528 


ATP-dependent RNA helicase, DEAD/DEAH box family


SAG0778 


88 


conserved hypothetical protein


SAG0779 


254 


conserved hypothetical protein


SAG0780 


246 


acyltransferase family protein


SAG0781


217 


competence protein CelA


SAG0782 


745 


DNA internalization-related competence protein ComEC/Rec2


SAG0783 


269 


hydrolase, haloacid dehalogenase-like family


SAG0784 


314 


sugar-binding transcriptional regulator, LacI family


SAG0785 


330 


conserved hypothetical protein


SAG0786 


242 


conserved domain protein 


SAG0787 


345 


DNA polymerase III, delta subunit, putative


SAG0788 


202 


superoxide dismutase, Fe-Mn 


SAG0789 


283 


transcriptional antiterminator LicT 


SAG0790 


622 


PTS system, beta-glucosides-specific IIABC components 


SAG0791 


475 


6-phospho-beta-glucosidase 


SAG0792 


364 


conserved hypothetical protein 


SAG0793 


380 


glycerate kinase 2 


SAG0794 


418 


permease, GntP family


SAG0795 


354 


conserved hypothetical protein 


SAG0796 


147 


transcriptional regulator, MarR family 


SAG0797 


342 


S-adenosylmethionine:tRNA ribosyltransferase-isomerase


SAG0798 


226 


membrane protein, putative 


SAG0799 


233 


glucosamine-6-phosphate isomerase 


SAG0800 


318 


glutathione S-transferase family protein 


SAG0801 


239 


ribosomal small subunit pseudouridine synthase A 


SAG0802 


38 


hypothetical protein 


SAG0803 


383 


major facilitator family protein 
SAG0804


315 


competence protein CoiA 


SAG0805 


601 


oligoendopeptidase B


SAG0806 


208 


hydrolase, haloacid dehalogenase-like family 


SAG0807 


235 


O-methyltransferase family protein 


SAG0808 


309 


protease maturation protein, putative 


SAG0809 


161 


conserved hypothetical protein 


SAG0810 


872 


alanyl-tRNA synthetase


SAG0811 


238 


membrane protein, putative 


SAG0812 


272 


glycosyl transferase, family 8 


SAG0813 


81 


hypothetical protein 


SAG0814 


95 


conserved hypothetical protein 


SAG0815 


71 


transcriptional regulator, Cro/CI family 


SAG0816 


253 


membrane protein, putative 


SAG0817 


187 


membrane protein, putative 


SAG0818 


319 


ribonucleoside-diphosphate reductase 2, beta subunit 


SAG0819 


719 


ribonucleoside-diphosphate reductase 2, alpha subunit 


SAG0820 


74 


ribonucleoside-diphosphate reductase 2, NrdH-redoxin 


SAG0821 


87 


phosphocarrier protein HPr 


SAG0822 


577 


phosphoenolpyruvate-protein phosphotransferase 


SAG0823 


475 


glyceraldehyde-3-phosphate dehydrogenase, NADP-dependent


SAG0824 


417 


polysaccharide deacetylase family protein 


SAG0825 


360 


ATP-dependent RNA helicase, DEAD/DEAH box family 


SAG0826 


209 


uridine kinase 


SAG0827 


165 


conserved hypothetical protein 


SAG0828 


554 


DNA polymerase III, gamma and tau subunits 


SAG0829 


64 


conserved hypothetical protein 


SAG0830 


311 


biotin--acetyl-CoA-carboxylase ligase


SAG0831 


398 


S-adenosylmethionine synthetase 


SAG0832 


753 


protein of unknown function 


SAG0833 


181 


hypothetical protein 


SAG0834 


42 


hypothetical protein 


SAG0835 


188 


conserved hypothetical protein 


SAG0836 


184 


conserved hypothetical protein 


SAG0837 


428 


ABC transporter, ATP-binding protein 


SAG0838 


233 


hypothetical protein 


SAG0839


226 


transcriptional regulator, TenA family 


SAG0840 


265 


phosphomethylpyrimidine kinase 


SAG0841 


256 


hydroxyethylthiazole kinase 


SAG0842 


223 


thiamine-phosphate pyrophosphorylase 


SAG0843 


419 


UDP-N-acetylglucosamine 1-carboxyvinyltransferase


SAG0844 


184 


acetyltransferase, GNAT family 


SAG0845 


427 


CBS domain protein 


SAG0846 


286 


methionine aminopeptidase, type I 


SAG0847 


306 


ribonuclease BN, putative 


SAG0848 


151 


GtrA family protein 


SAG0849 


169 


conserved hypothetical protein 


SAG0850 


652 


DNA ligase, NAD-dependent 


SAG0851 


339 


bmrU protein, putative 
SAG0852


766 


pullulanase, putative 


SAG0853 


622 


1,4-alpha-glucan branching enzyme


SAG0854 


379 


glucose-1-phosphate adenylyltransferase


SAG0855 


NA 


glycogen biosynthesis protein GlgD, authentic frameshift 


SAG0856 


476 


glycogen synthase 


SAG0857 


66 


ATP synthase F0, C subunit 


SAG0858 


238 


ATP synthase F0, A subunit 


SAG0859 


165 


ATP synthase F0, B subunit 


SAG0860 


178 


ATP synthase F1, delta subunit


SAG0861 


501 


ATP synthase F1, alpha subunit


SAG0862 


293 


ATP synthase F1, gamma subunit


SAG0863 


468 


ATP synthase F1, beta subunit


SAG0864 


137 


ATP synthase F1, epsilon subunit


SAG0865 


76 


conserved hypothetical protein 


SAG0866 


423 


UDP-N-acetylglucosamine 1-carboxyvinyltransferase


SAG0867 


63 


conserved hypothetical protein 


SAG0868 


285 


DNA-entry nuclease 


SAG0869 


346 


phenylalanyl-tRNA synthetase, alpha subunit 


SAG0870 


173 


acetyltransferase, GNAT family 


SAG0871 


801 


phenylalanyl-tRNA synthetase, beta subunit 


SAG0872 


300 


conserved hypothetical protein 


SAG0873 


1077 


exonuclease RexB 


SAG0874 


1207 


exonuclease RexA 


SAG0875 


305 


magnesium transporter, CorA family, putative 


SAG0876 


458 


tRNA modification GTPase TrmE 


SAG0877 


636 


ABC transporter, ATP-binding protein 


SAG0878 


322 


acetoin dehydrogenase, thymine PPi dependent, E1 component,
alpha subunit


SAG0879 


332 


acetoin dehydrogenase, thymine PPi dependent, E1 component,
beta subunit 


SAG0880 


462 


acetoin dehydrogenase, thymine PPi dependent, E2 component, 
dihydrolipoamide acetyltransferase


SAG0881 


585 


acetoin dehydrogenase, thymine PPi dependent, E3 component, 
dihydrolipoamide dehydrogenase 


SAG0882 


329 


lipoate-protein ligase A 


SAG0883 


261 


cobyric acid synthase, putative 


SAG0884 


447 


mur ligase family protein 


SAG0885 


283 


conserved hypothetical protein TIGR00159 


SAG0886 


319 


protein of unknown function 


SAG0887 


450 


phosphoglucomutase/phosphomannomutase family protein 


SAG0888 


123 


conserved hypothetical protein 


SAG0889 


126 


conserved hypothetical protein 


SAG0890 


376 


oxygen-independent coproporphyrinogen III oxidase, putative 


SAG0891 


245 


conserved hypothetical protein 


SAG0892 


256 


hydrolase, haloacid dehalogenase-like family 


SAG0893 


218 


conserved hypothetical protein 


SAG0894 


1370 


protein of unknown function 


SAG0895 


289 


lipoyl-binding domain protein 
SAG0896


108 


oxidoreductase, putative 


SAG0897


221 


conserved hypothetical protein 


SAG0898 


83 


hypothetical protein 


SAG0899 


57 


hypothetical protein 


SAG0900 


56 


hypothetical protein 


SAG0901 


127 


hypothetical protein 


SAG0902 


45 


hypothetical protein 


SAG0903 


44 


hypothetical protein 


SAG0904 


56 


hypothetical protein 


SAG0905 


138 


nucleoside diphosphate kinase 


SAG0906 


610 


GTP-binding protein LepA 


SAG0907 


877 


protein of unknown function/lipoprotein, putative 


SAG0908 


203 


HD domain protein 


SAG0909 


154 


acetyltransferase, GNAT family 


SAG0910 


144 


PilB-related protein 


SAG0911 


930 


cation-transporting ATPase, E1-E2 family 


SAG0912 


367 


nucleoside diphosphate kinase domain protein 


SAG0913 


212 


chloramphenicol acetyltransferase 


SAG0914 


203 


conserved hypothetical protein 


SAG0915 


405 


Tn916, transposase 


SAG0916 


67 


Tn916, excisionase 


SAG0917 


83 


Tn916, hypothetical protein 


SAG0918 


76 


Tn916, hypothetical protein 


SAG0919 


157 


Tn916, hypothetical protein


SAG0920 


23 


Tn916, hypothetical protein


SAG0921 


117 


Tn916, transcriptional regulator, putative 


SAG0922 


61 


Tn916, hypothetical protein 


SAG0923 


639 


Tn916, tetracycline resistance protein 


SAG0924 


28 


Tn916, tetM leader peptide 


SAG0925 


310 


Tn916, hypothetical protein 


SAG0926 


333 


Tn916, NLP/P60 family protein 


SAG0927 


725 


membrane protein, putative 


SAG0928 


NA 


Tn916, hypothetical protein, authentic frameshift 


SAG0929 


168 


Tn916, hypothetical protein 


SAG0930 


165 


Tn916, hypothetical protein


SAG0931 


73 


Tn916, hypothetical protein 


SAG0932 


401 


Tn916, transcriptional regulator, putative 


SAG0933 


461 


Tn916, FtsK/SpoIIIE family protein 


SAG0934 


128 


Tn916, hypothetical protein 


SAG0935 


104 


Tn916, hypothetical protein 


SAG0936 


39 


Tn916, hypothetical protein 


SAG0937 


NA 


ABC transporter, ATP-binding protein, authentic frameshift 


SAG0938 


122 


transcriptional regulator, GntR family 


SAG0939 


1034 


DNA polymerase III, alpha subunit 


SAG0940 


340 


6-phosphofructokinase 


SAG0941 


500 


pyruvate kinase 


SAG0942 


185 


signal peptidase I, putative 


SAG0943 


47 


hypothetical protein 
SAG0944


604 


glucosamine--fructose-6-phosphate aminotransferase, isomerizing


SAG0945 


377 


IS1548, transposase


SAG0946 


109 


phnA protein 


SAG0947 


213 


amino acid ABC transporter, permease protein 


SAG0948 


209 


amino acid ABC transporter, ATP-binding protein 


SAG0949 


276 


amino acid ABC transporter, amino acid-binding protein 


SAG0950 


82 


ribosomal protein S20 


SAG0951 


306 


pantothenate kinase 


SAG0952 


196 


conserved hypothetical protein 


SAG0953 


129 


cytidine deaminase 


SAG0954 


349 


protein of unknown function/lipoprotein, putative 


SAG0955 


511 


sugar ABC transporter, ATP-binding protein 


SAG0956 


353 


sugar ABC transporter, permease protein, putative 


SAG0957 


318 


sugar ABC transporter, permease protein, putative 


SAG0958 


456 


NADH oxidase 


SAG0959 


329 


L-lactate dehydrogenase 


SAG0960 


819 


DNA gyrase, A subunit 


SAG0961 


247 


sortase SrtA 


SAG0962 


137 


glyoxylase family protein 


SAG0963 


320 


conserved hypothetical protein 


SAG0964 


375 


Na+/H+ exchanger family protein 


SAG0965 


127 


IS1381, transposase OrfA


SAG0966 


129 


IS1381, transposase OrfB


SAG0967 


520 


GMP synthase 


SAG0968 


232 


transcriptional regulator, GntR family 


SAG0969 


444 


rid protein 


SAG0970 


247 


acetyltransferase, GNAT family 


SAG0971 


282 


protein of unknown function/lipoprotein, putative 


SAG0972 


NA 


conserved hypothetical protein, authentic frameshift 


SAG0973 


320 


nisin-resistance protein, putative 


SAG0974 


250 


ABC transporter, ATP-binding protein 


SAG0975 


651 


ABC transporter, permease protein, putative 


SAG0976 


222 


DNA-binding response regulator 


SAG0977 


312 


sensor histidine kinase 


SAG0978 


356 


site-specific recombinase, phage integrase family 


SAG0979 


553 


ABC transporter, substrate-binding protein 


SAG0980 


257 


conserved hypothetical protein 


SAG0981 


228 


satD protein 


SAG0982 


521 


signal recognition particle protein Ffh 


SAG0983 


110 


conserved hypothetical protein 


SAG0984 


437 


sensor histidine kinase CiaH 


SAG0985 


226 


DNA-binding response regulator CiaR 


SAG0986 


849 


aminopeptidase N 


SAG0987 


217 


phosphate transport system regulatory protein PhoU 


SAG0988 


252 


phosphate ABC transporter, ATP-binding protein PstB, putative


SAG0989 


267 


phosphate ABC transporter, ATP-binding protein PstB, putative 


SAG0990 


295 


phosphate ABC transporter, permease protein PstA, putative 


SAG0991 


305


phosphate ABC transporter, permease protein 
SAG0992


286 


phosphate ABC transporter, phosphate-binding protein 


SAG0993 


436 


NOL1/NOP2/sun family protein


SAG0994 


254 


inositol monophosphatase family protein 


SAG0995 


93 


conserved hypothetical protein 


SAG0996 


137 


conserved hypothetical protein 


SAG0997 


310 


macrolide-efflux protein mreA/riboflavin biosynthesis protein
RibF


SAG0998 


294 


tRNA pseudouridine synthase B 


SAG0999 


143 


acetyltransferase, GNAT family 


SAG1000


423 


conserved hypothetical protein 


SAG1001


196 


conserved hypothetical protein 


SAG1002


292 


protease, putative 


SAG1003


876 


permease, putative 


SAG1004


233 


ABC transporter, ATP-binding protein 


SAG1005 


706 


DNA topoisomerase I 


SAG1006


280 


DprA/SMF protein, putative DNA processing factor 


SAG1007 


342 


iron-compound ABC transporter, iron-compound-binding protein 


SAG1008


253 


iron compound ABC transporter, ATP-binding protein 


SAG1009 


324 


iron compound ABC transporter, permease protein 


SAG1010 


320 


iron compound ABC transporter, permease protein 


SAG1011 


182 


acetyltransferase, CysE/LacA/LpxA/NodL family 


SAG1012 


253 


ribonuclease HII 


SAG1013 


283 


GTP-binding protein 


SAG1014 


190 


conserved hypothetical protein 


SAG1015 


494 


carbon starvation protein CstA, putative 


SAG1016 


244 


response regulator 


SAG1017 


579 


sensor histidine kinase, putative 


SAG1018 


40 


lipoprotein, putative 


SAG1019 


39 


hypothetical protein 


SAG1020


227 


lipoprotein, putative 


SAG1021 


107 


hypothetical protein 


SAG1022


177 


hypothetical protein 


SAG1023


48 


hypothetical protein 


SAG1024


183 


lipoprotein, putative 


SAG1025


149 


hypothetical protein 


SAG1026


NA 


immunogenic secreted protein, degenerate 


SAG1027


84 


conserved hypothetical protein 


SAG1028 


196 


hypothetical protein 


SAG1029


101 


hypothetical protein 


SAG1030 


304 


protein of unknown function 


SAG1031


120 


conserved domain protein 


SAG1032 


85 


conserved hypothetical protein 


SAG1033


1309 


FtsK/SpoIIIE family protein 


SAG1034


55 


hypothetical protein 


SAG1035


424 


conserved hypothetical protein 


SAG1036


80 


conserved hypothetical protein 


SAG1037


157 


hypothetical protein 


SAG1038


1003 


phage infection protein, putative 
SAG1039


96 


conserved hypothetical protein 


SAG1040 


260 


conserved domain protein 


SAG1041


107 


hypothetical protein 


SAG1042


1060 


carbamoyl-phosphate synthase, large subunit 


SAG1043


358 


carbamoyl-phosphate synthase, small subunit 


SAG1044


307 


aspartate carbamoyltransferase 


SAG1045 


430 


dihydroorotase, multifunctional complex type 


SAG1046


209 


orotate phosphoribosyltransferase 


SAG1047 


233 


orotidine 5-phosphate decarboxylase 


SAG1048


410 


membrane protein, putative 


SAG1049 


513 


ABC transporter, ATP-binding protein 


SAG1050


112 


ribonucleotide reductase, truncation 


SAG1051


358 


aspartate-semialdehyde dehydrogenase 


SAG1052


47 


cell wall surface anchor family protein, putative 


SAG1053 


30 


hypothetical protein 


SAG1054 


531 


cardiolipin synthetase 


SAG1055


556 


formate--tetrahydrofolate ligase


SAG1056


339 


lipoate-protein ligase A 


SAG1057 


292 


conserved hypothetical protein 


SAG1058 


272 


conserved hypothetical protein 


SAG1059


110 


glycine cleavage system H protein, putative 


SAG1060


328 


bacterial luciferase family protein 


SAG1061


399 


oxidoreductase, FMN-binding 


SAG1062


282 


lipoate-protein ligase A family protein 


SAG1063


228 


flavoprotein-related protein 


SAG1064


180 


flavoprotein family protein 


SAG1065


190 


membrane protein, putative 


SAG1066


572 


phosphoglucomutase 


SAG1067


178 


IS861, transposase OrfA 


SAG1068


277 


IS861, transposase OrfB 


SAG1069 


65 


hypothetical protein 


SAG1070


577 


ABC transporter, ATP-binding/permease protein 


SAG1071 


573 


ABC transporter, ATP-binding/permease protein 


SAG1072 


200 


conserved hypothetical protein 


SAG1073


325 


conserved hypothetical protein 


SAG1074 


418 


serine hydroxymethyltransferase 


SAG1075


183 


Sua5/YciO/YrdC/YwlC family protein 


SAG1076


276 


modification methylase, HemK family 


SAG1077


359 


peptide chain release factor 1 


SAG1078


189 


thymidine kinase


SAG1079


60 


4-oxalocrotonate tautomerase 


SAG1080


47 


hypothetical protein 


SAG1081


312 


ApbE family protein 


SAG1082


200 


conserved hypothetical protein 


SAG1083 


411 


conserved hypothetical protein 


SAG1084


262 


formate/nitrite transporter family protein 


SAG1085


424 


xanthine permease 


SAG1086 


193 


xanthine phosphoribosyltransferase 
SAG1087


327 


guanosine monophosphate reductase


SAG1088 


446 


drug resistance transporter, EmrB/QacA family, putative 


SAG1089 


230 


conserved hypothetical protein 


SAG1090


666 


potassium uptake protein, putative 


SAG1091 


216 


oxidoreductase, short chain dehydrogenase/reductase family 


SAG1092


330 


phosphate acetyltransferase 


SAG1093


294 


ribosomal large subunit pseudouridine synthase, RluD subfamily


SAG1094


278 


conserved hypothetical protein 


SAG1095 


223 


GTP pyrophosphokinase family protein


SAG1096


190 


conserved hypothetical protein 


SAG1097 


324 


ribose-phosphate pyrophosphokinase 


SAG1098


371 


cysteine desulphurase 


SAG1099


115 


conserved hypothetical protein 


SAG1100 


210 


conserved hypothetical protein 


SAG1101 


226 


DNA repair protein RadC 


SAG1102 


377 


membrane protein, putative 


SAG1103


478 


6-phospho-beta-glucosidase 


SAG1104 


204 


platelet activating factor, putative 


SAG1105


273 


hydrolase, haloacid dehalogenase-like family


SAG1106


309 


transcriptional regulator, AraC family, putative 


SAG1107 


510 


voltage-gated chloride channel family protein 


SAG1108 


357 


spermidine/putrescine ABC transporter, spermidine/putrescine-binding protein


SAG1109 


258 


spermidine/putrescine ABC transporter, permease protein


SAG1110 


264 


spermidine/putrescine ABC transporter, permease protein


SAG1111


384 


spermidine/putrescine ABC transporter, ATP-binding protein 


SAG1112 


300 


UDP-N-acetylenolpyruvoylglucosamine reductase 


SAG1113 


162 


2-amino-4-hydroxy-6-hydroxymethyldihydropteridine
pyrophosphokinase


SAG1114 


120 


dihydroneopterin aldolase


SAG1115 


267 


dihydropteroate synthase 


SAG1116 


187 


GTP cyclohydrolase I 


SAG1117 


420 


folylpolyglutamate synthase 


SAG1118 


295 


rarD protein 


SAG1119 


288 


homoserine kinase


SAG1120 


427 


homoserine dehydrogenase


SAG1121 


295 


polysaccharide deacetylase family protein


SAG 1122 


515 


transporter, BCCT family protein 


SAG 1123 


34 


hypothetical protein 


SAG 1124 


458 


aldehyde dehydrogenase family protein 


SAG1125 


335 


membrane protein, putative 


SAG 1126 


228 


protein of unknown function 


SAG 1127 


446 


conserved domain protein 


SAG1128


65 


transcriptional regulator, Cro/CI family 


SAG 1129 


36 


hypothetical protein 


SAG1130 


49 


hypothetical protein 


SAG1131 


164 


thiol peroxidase 


SAG1132 


219 


conserved hypothetical protein 
SAG 1133


254 


conserved hypothetical protein 


SAG1134 


213 


transcriptional regulator, GntR family/potassium uptake protein,
TrkA family 


SAG1135 


183 


gls24 protein, putative 


SAG1136 


65 


conserved hypothetical protein 


SAG1137 


180 


gls24 protein, putative 


SAG1138 


64 


conserved hypothetical protein 


SAG1139 


193 


conserved hypothetical protein 


SAG1140


82


conserved hypothetical protein 


SAG1141 


112 


conserved hypothetical protein 


SAG1142


759 


ATP-dependent DNA helicase PcrA 


SAG1143


128 


conserved hypothetical protein 


SAG1144 


441 


uracil permease 


SAG1145


448 


sodium:alanine symporter family protein 


SAG1146


411 


cation efflux family protein 


SAG1147


130 


conserved hypothetical protein 


SAG1148


231 


membrane protein, putative 


SAG1149


207 


lipoprotein, putative 


SAG 1150 


400 


ribosomal protein S1


SAG1151 


76 


conserved hypothetical protein 


SAG1152


340 


branched-chain amino acid aminotransferase 


SAG 1153 


819 


DNA topoisomerase IV, A subunit 


SAG1154 


653 


DNA topoisomerase IV, B subunit 


SAG1155 


212 


membrane protein, putative 


SAG1156 


217 


uracil-DNA glycosylase 


SAG1157 


161 


conserved hypothetical protein 


SAG1158 


413 


CMP-N-acetylneuraminic acid synthetase NeuA 


SAG1159 


209 


neuD protein 


SAG1160


384 


UDP-N-acetylglucosamine-2-epimerase NeuC 


SAG1161 


341 


N-acetylneuraminic acid synthetase NeuB


SAG1162 


466 


polysaccharide biosynthesis protein CpsL 


SAG 1163 


318 


polysaccharide biosynthesis protein CpsK(V) 


SAG 1164 


321 


glycosyl transferase CpsJ(V) 


SAG 1165 


327 


glycosyl transferase CpsO(V) 


SAG 1166 


295 


glycosyl transferase CpsN(V) 


SAG 1167 


241 


polysaccharide biosynthesis protein CpsM(V) 


SAG1168 


364 


polysaccharide biosynthesis protein cpsH(V) 


SAG1169


163 


glycosyl transferase CpsG(V) 


SAG1170


149 


polysaccharide biosynthesis protein CpsF 


SAG1171 


462 


glycosyl transferase CpsE 


SAG1172


229 


cpsD protein 


SAG1173 


230 


cpsC protein 


SAG 1174 


243 


capsular polysaccharide biosynthesis protein CpsB 


SAG1175


485 


capsular polysaccharide biosynthesis protein CpsA 


SAG 1176 


290 


transcriptional regulator, LysR family, putative 


SAG 1177 


255 


conserved hypothetical protein 


SAG1178 


236 


purine nucleoside phosphorylase 


SAG 1179 


418 


voltage-gated chloride channel family protein, putative 
SAG1180


269 


purine nucleoside phosphorylase


SAG1181


135 


arsenate reductase 


SAG1182


403 


phosphopentomutase


SAG1183


223 


ribose 5-phosphate isomerase


SAG1184 


236 


conserved hypothetical protein 


SAG1185 


262 


tributyrin esterase 


SAG1186 


553 


metallo-beta-lactamase superfamily protein


SAG1187 


253 


ABC transporter, ATP-binding protein


SAG1188 


287 


ABC transporter, permease protein 


SAG1189 


334 


conserved hypothetical protein 


SAG1190 


551 


adherence and virulence protein A 


SAG1191 


239 


alpha-acetolactate decarboxylase 


SAG1192 


560 


acetolactate synthase, catabolic 


SAG1193 


408 


TPR domain protein 


SAG1194


396 


membrane protein, putative 


SAG1195 


153


MutT/nudix family protein 


SAG1196


160


mutator MutT protein 


SAG1197 


1072 


hyaluronidase 


SAG1198 


348 


dTDP-glucose 4,6-dehydratase


SAG1199 


197 


dTDP-4-dehydrorhamnose 3,5-epimerase


SAG 1200 


289


glucose-1-phosphate thymidylyltransferase


SAG1201 


367


iminodiacetate oxidase, putative


SAG 1202 


262 


conserved hypothetical protein 1 ICjKUU4oo 


SAG 1203 


227 


conserved hypothetical protein 


SAG 1204 


226 


DNA replication protein DnaD, putative 


SAG 1205 


172 


adenine phosphoribosyltransferase 


SAG 1206 


854 


conserved domain protein 


SAG 1207 


32 


hypothetical protein


SAG1208 


732 


single-stranded-DNA-specific exonuclease RecJ


SAG1209 


253 


oxidoreductase, short chain dehydrogenase/reductase family


SAG1210 


309 


metallo-beta-lactamase superfamily protein


SAG1211 


215 


conserved hypothetical protein


SAG1212 


412 


GTP-binding protein HflX


SAG1213 


296 


tRNA delta(2)-isopentenylpyrophosphate transferase


SAG1214 


58 


hypothetical protein


SAG1215 


305 


exfoliative toxin A, putative


SAG1216 


1252 


pullulanase, putative


SAG1217 


NA 


conserved hypothetical protein, authentic frameshift


SAG1218 


194 


conserved hypothetical protein


SAG1219 


468 


peptidase, M20/M25/M40 family


SAG1220




nitroreductase family protein


SAG 1221 


NA 


glycerophosphoryl diester phosphodiesterase, putative, authentic 
point mutation 


SAG 1222 


593 


excinuclease ABC, C subunit 


SAG 1223 


255 


conserved hypothetical protein


SAG 1224 


446 


MATE efflux family protein 


SAG 1225 


136 


conserved hypothetical protein 


SAG 1226 


165 


conserved hypothetical protein
SAG1227


198 


protein of unknown function 


SAG1228 


96 


ISSdy1, transposase OrfA


SAG 1229 


259 


ISSdy1, transposase OrfB


SAG1230 


96 


conserved hypothetical protein 


SAG 1231 


NA 


transposase OrfB, IS3 family, degenerate 


SAG 1232 


77 


transposase OrfB, IS3 family, truncation 


SAG1233


822 


streptococcal histidine triad family protein 


SAG 1234 


306 


laminin-binding surface protein 


SAG1235


425 


GBSi1, group II intron, maturase


SAG1236 


NA 


C5a peptidase, authentic frameshift 


SAG1237 


444 


hypothetical protein 


SAG1238


202 


hypothetical protein 


SAG1239 


76 


conserved hypothetical protein 


SAG 1240 


125 


conserved hypothetical protein, truncation 


SAG 1241 


91 


transposase OrfA, IS3 family 


SAG 1242 


67 


transposase OrfB, IS3 family, truncation 


SAG 1243 


96 


ISSdy1, transposase OrfA


SAG 1244 


259 


ISSdy1, transposase OrfB


SAG1245 


38 


hypothetical protein 


SAG 1246 


389 


hypothetical protein 


SAG 1247 


399 


site-specific recombinase, phage integrase family 


SAG 1248 


75 


conserved hypothetical protein 


SAG1249 


74 


transcriptional regulator, Cro/CI family 


SAG 1250 


621 


Tn5252, relaxase 


SAG 1251 


121 


Tn5252, Orf 9 protein 


SAG1252 


120 


Tn5252, Orf 10 protein 


SAG1253 


435 


transposase, ISL3 family 


SAG 1254 


546 


mercuric reductase 


SAG1255 


130 


mercuric resistance operon regulatory protein MerR 


SAG1256 


142 


IS861, transposase OrfB, truncation 


SAG1257 


709 


cation-transporting ATPase, E1-E2 family 


SAG1258 


122 


cadmium efflux system accessory protein 


SAG1259 


99 


conserved hypothetical protein 


SAG 1260 


262 


hypothetical protein 


SAG1261 


198 


conserved hypothetical protein 


SAG1262 


695 


cation-transporting ATPase, E1-E2 family 


SAG 1263 


NA 


conserved domain protein, authentic frameshift 


SAG 1264 


148 


transcriptional repressor CopY, putative 


SAG 1265 


206 


cadmium resistance transporter, putative 


SAG 1266 


152 


hypothetical protein 


SAG1267 


108 


hypothetical protein 


SAG1268 


230 


repressor protein, putative 


SAG1269 


44 


hypothetical protein 


SAG 1270 


471 


ImpB/MucB/SamB family protein 


SAG1271 


116 


conserved hypothetical protein 


SAG 1272 


102 


conserved hypothetical protein 


SAG 1273 


118 


conserved hypothetical protein 


SAG 1274 


129 


conserved hypothetical protein 
SAG1275


75 


hypothetical protein 


SAG 1276 


358 


conserved hypothetical protein 


SAG 1277 


163 


hypothetical protein 


SAG1278 


96 


hypothetical protein 


SAG 1279 


99 


conserved domain protein 


SAG 1280 


2274 


SNF2 family protein


SAG1281 


183 


hypothetical protein 


SAG 1282 


63 


calcium-binding protein, putative 


SAG1283 


1631 


agglutinin receptor 


SAG1284 


196 


abortive infection protein AbiGI 


SAG1285 


281 


abortive infection protein AbiGII 


SAG 1286 


933 


Tn5252, Orf28 


SAG1287 


776 


Tn5252, Orf26 


SAG1288 


NA 


Tn5252, Orf25, degenerate 


SAG 1289 


284 


Tn5252, Orf23 


SAG 1290 


80 


hypothetical protein 


SAG 1291 


605 


Tn5252, Orf 21 protein, internal deletion


SAG 1292 


162 


hypothetical protein 


SAG 1293 


194 


protease, putative 


SAG 1294 


77 


conserved hypothetical protein 


SAG 1295 


127 


conserved hypothetical protein 


SAG 1296 


142 


conserved hypothetical protein 


SAG 1297 


451 


C-5 cytosine-specific DNA methylase 


SAG1298 


31 


hypothetical protein 


SAG 1299 


272 


conserved hypothetical protein 


SAG1300 


57 


conserved hypothetical protein 


SAG 1301 


121 


ribosomal protein L7/L12 


SAG 1302 


166 


ribosomal protein L10


SAG1303


702 


ATP-dependent Clp protease, ATP-binding subunit 


SAG1304 


32 


hypothetical protein 


SAG1305 


314 


homocysteine S-methyltransferase MmuM, putative 


SAG 1306 


458 


amino acid permease 


SAG 1307 


216 


hypothetical protein 


SAG 1308 


167 


hypothetical protein 


SAG 1309 


30 


hypothetical protein 


SAG1310 


182 


transcriptional regulator, TetR family 


SAG1311 


198 


GTP-binding protein 


SAG1312 


408 


ATP-dependent Clp protease, ATP-binding subunit ClpX 


SAG1313 


56 


conserved hypothetical protein 


SAG1314 


164 


dihydrofolate reductase 


SAG1315 


279 


thymidylate synthase 


SAG1316 


390 


HMG-CoA synthase 


SAG1317 


427 


3-hydroxy-3-methylglutaryl-CoA reductase 


SAG1318 


149 


conserved hypothetical protein 


SAG1319 


214 


hemolysin III, putative 


SAG 1320 


304 


conserved hypothetical protein TIGR00147 


SAG 1321 


284 


glutathione S-transferase family protein, putative 


SAG1322 


72 


conserved domain protein 
SAG1323


331 


isopentenyl-diphosphate delta-isomerase 


SAG 1324 


330 


phosphomevalonate kinase 


SAG1325 


314 


diphosphomevalonate decarboxylase 


SAG 1326 


292 


mevalonate kinase, putative 


SAG 1327 


409 


sensor histidine kinase 


SAG 1328 


228 


DNA-binding response regulator 


SAG1329 


208 


GTP pyrophosphokinase family protein 


SAG 1330 


68 


hypothetical protein 


SAG 1331 


979 


R5 protein 


SAG 1332 


146 


transcriptional regulator, MarR family, putative 


SAG1333 


690 


5'-nucleotidase family protein 


SAG1334 


136 


polypeptide deformylase, putative 


SAG1335 


449 


NADP-specific glutamate dehydrogenase 


SAG1336 


169 


membrane protein, putative 


SAG1337 


589 


ABC transporter, ATP-binding/permease protein 


SAG1338 


579 


ABC transporter, ATP-binding/permease protein 


SAG1339 


157 


acetyltransferase, GNAT family 


SAG 1340 


622 


ABC transporter, ATP-binding protein 


SAG 1341 


402 


polyA polymerase family protein 


SAG 1342 


282 


DegV family protein 


SAG 1343 


126 


protein of unknown function 


SAG 1344 


177 


hypothetical protein 


SAG1345 


164 


conserved hypothetical protein 


SAG 1346 


654 


PTS system, fructose specific IIABC components 


SAG1347 


303 


1-phosphofructokinase


SAG 1348 


247 


lactose phosphotransferase system repressor 


SAG 1349 


411 


beta-lactam resistance factor 


SAG1350


544 


surface antigen-related protein 


SAG1351 


307 


2-dehydropantoate 2-reductase, putative 


SAG1352 


356 


regulatory protein, putative 


SAG1353 


330 


pyridine nucleotide-disulphide oxidoreductase family protein 


SAG1354


251 


tRNA (guanine-N1)-methyltransferase


SAG1355 


172 


16S rRNA processing protein RimM 


SAG1356 


503 


transcriptional regulator, RofA family 


SAG1357 


80 


KH domain protein 


SAG1358 


90 


ribosomal protein S16


SAG1359 


415 


permease, putative 


SAG1360


236 


ABC transporter, ATP-binding protein 


SAG1361 


414 


conserved hypothetical protein 


SAG 1362 


532 


carbamoyl-phosphate synthase, large subunit, putative 


SAG1363 


356 


carbamoyl-phosphate synthase, small subunit 


SAG1364


173 


pyrimidine operon regulatory protein 


SAG1365 


296 


ribosomal large subunit pseudouridine synthase, RluD subfamily 


SAG1366


154 


lipoprotein signal peptidase 


SAG1367


301 


transcriptional regulator, LysR family 


SAG1368 


94 


ribosomal protein L27 


SAG1369


112 


conserved hypothetical protein 


SAG 1370 


104 


ribosomal protein L21 


SAG1371 


392 


conserved hypothetical protein


SAG1372


404 


thiamine biosynthesis protein ThiI


SAG1373 


381 


cysteine desulphurase


SAG1374 


150 


conserved hypothetical protein


SAG1375 


449 


glutathione reductase


SAG 1376 


111 


conserved hypothetical protein


SAG 1377 


388 


^iivjx ioijljlci. tv/ ayiiuiuov 


SAG1378 


355 


3-dehydroquinate synthase


SAG1379


225 


3-dehydroquinate dehydratase


SAG1380


385 


conserved hypothetical protein


SAG1381 


714 


sulfatase


SAG1382 


119


ribosomal protein L20


SAG1383


66 


ribosomal protein L35 


SAG1384


176 


translation initiation factor IF-3


SAG1385 


227 


cytidylate kinase


SAG1386 


174 


conserved hypothetical protein


SAG1387


65 


lv/1 1 t/LIU^VIllj ti c 


SAG1388 


163


conserved hypothetical protein


SAG1389


406 


peptidase T


SAG1390 


544 


polysaccharide biosynthesis protein, putative


SAG1391 


484 


UDP-N-acetylmuramoylalanyl-D-glutamate--2,6-diaminopimelate
ligase


SAG1392 


264 


iron compound ABC transporter, ATP-binding protein


SAG1393 


310 


iron compound ABC transporter, substrate-binding protein


SAG1394


341 


iron compound ABC transporter, permease protein


SAG1395 


333 


iron compound ABC transporter, permease protein


SAG 1396 


217 


conserved hypothetical protein


SAG 1397 


311 


inorganic pyrophosphatase, manganese-dependent


SAG1398 


262 


pyruvate formate-lyase-activating enzyme


SAG1399 


444 


CBS domain protein


SAG 1400 


188 


conserved hypothetical protein


SAG1401 


311 


conserved hypothetical protein TIGR01212


SAG 1402 


213 


PAP2 family protein


SAG 1403 


194 


membrane protein, putative


SAG 1404 


308 


cell wall surface anchor family protein


SAG1405 


294 


sortase family protein


SAG 1406 


293 


sortase family protein


SAG1407 


705 


cell wall surface anchor family protein


SAG1408


901 


cell wall surface anchor family protein


SAG 1409 


NA 


rogB protein, authentic frameshift


SAG1410 


379 


glycosyl transferase, group 1 family protein


SAG1411 


282 


glycosyl transferase, group 2 family protein


SAG1412 


474 


polysaccharide biosynthesis protein


SAG1413 


454 


membrane protein, putative


SAG1414 


308 


glycosyl transferase, group 2 family protein


SAG1415 


311 


glycosyl transferase, group 2 family protein


SAG1416 


352 


nucleotide sugar dehydratase, putative 


SAG1417 


240 


nucleotidyl transferase, putative
SAG1418


274 


polysaccharide biosynthesis protein, putative 


SAG1419 


577 


lipoprotein, putative 


SAG 1420 


117 


conserved hypothetical protein 


SAG 1421 


243 


glycosyl transferase, group 2 family protein 


SAG 1422 


313 


glycosyl transferase, group 2 family protein 


SAG1423 


384 


glycosyl transferase, putative 


SAG1424 


284 


dTDP-4-dehydrorhamnose reductase 


SAG 1425 


113 


conserved hypothetical protein 


SAG1426 


369 


RNA polymerase sigma-70 factor 


SAG1427 


602 


DNA primase 


SAG1428 


125 


large conductance mechanosensitive channel protein 


SAG1429 


58 


ribosomal protein S21 


SAG 1430 


167 


conserved hypothetical protein 


SAG 1431 


268 


amino acid ABC transporter, amino acid-binding protein 


SAG1432 


347 


ammonium transporter family protein 


SAG1433 


375 


conserved hypothetical protein 


SAG1434 


328 


rhodanese family protein 


SAG1435 


101 


conserved hypothetical protein 


SAG1436 


457 


glycerol-3-phosphate transporter, putative 


SAG1437 


55 


hypothetical protein 


SAG1438 


754 


glycogen phosphorylase 


SAG1439 


498 


4-alpha-glucanotransferase 


SAG 1440 


342 


maltose operon repressor MalR, putative 


SAG 1441 


415 


maltose/maltodextrin ABC transporter, maltose/maltodextrin- 
binding protein 


SAG 1442 


456 


maltose ABC transporter, permease protein 


SAG 1443 


278 


maltose ABC transporter, permease protein 


SAG 1444 


490 


proton/peptide symporter family protein 


SAG1445 


NA 


MutT/nudix family protein, authentic frameshift 


SAG 1446 


62


hypothetical protein 


SAG 1447 


441 


conserved hypothetical protein 


SAG1448 


502 


glycosyl transferase, group 1 family protein 


SAG1449 


795 


preprotein translocase SecA subunit, putative 


SAG1450 


330 


conserved domain protein 


SAG 1451 


494 


conserved hypothetical protein 


SAG1452 


514 


conserved hypothetical protein 


SAG1453 


409 


preprotein translocase SecY family protein 


SAG1454 


398 


glycosyl transferase, putative 


SAG 1455 


295 


glycosyl transferase, group 2 family protein 


SAG1456 


NA 


glycosyl transferase, family 8, degenerate 


SAG1457


129 


IS1381, transposase OrfB


SAG1458


127 


IS1381, transposase OrfA


SAG1459 


413 


glycosyl transferase family 8 


SAG 1460 


401 


glycosyl transferase, family 8 


SAG 1461 


335 


conserved hypothetical protein 


SAG 1462 


970 


cell wall surface anchor family protein 


SAG 1463 


NA 


transcriptional regulator, RofA family, authentic point mutation 


SAG 1464 


663 


excinuclease ABC, B subunit 
SAG1465


306 


protease, putative 


SAG1466 


727 


glutamine ABC transporter, glutamine-binding protein/permease 
protein 


SAG 1467 


246 


glutamine ABC transporter, ATP-binding protein, GlnQ putative 


SAG1468 


116 


conserved hypothetical protein 


SAG1469 


52 


conserved hypothetical protein 


SAG1470 


437


GTP-binding protein, GTP1/Obg family


SAG 1471 


42 


conserved hypothetical protein 


SAG 1472 


413 


aminopeptidase PepS 


SAG1473 


192 


cell wall surface anchor family protein 


SAG 1474 


680 


amidase family protein 


SAG1475 


240 


ribosomal small subunit pseudouridine synthase A 


SAG1476 


280 


oxidoreductase, aldo/keto reductase family 


SAG 1477 


224 


nitroreductase family protein 


SAG1478 


130 


lactoylglutathione lyase 


SAG 1479 


308 


glycosyl transferase, group 2 family protein 


SAG1480 


462 


amino acid permease 


SAG1481 


155 


SsrA-binding protein 


SAG1482 


801 


exoribonuclease, VacB/Rnb family 


SAG1483 


78 


preprotein translocase, SecG subunit 


SAG 1484 


48 


ribosomal protein L33 


SAG1485 


389 


multi-drug resistance protein 


SAG1486 


548 


membrane protein, putative 


SAG1487 


233 


ABC transporter, ATP-binding protein


SAG 1488 


195 


dephospho-CoA kinase 


SAG1489 


273 


formamidopyrimidine-DNA glycosylase 


SAG 1490 


282 


transcriptional regulator, MutR family 


SAG 1491 


530 


hypothetical protein 


SAG 1492 


58 


hypothetical protein 


SAG 1493 


66 


hypothetical protein 


SAG 1494 


32 


hypothetical protein 


SAG1495 


81 


CAAX amino terminal protease family protein 


SAG 1496 


110 


hypothetical protein 


SAG 1497 


37 


hypothetical protein 


SAG1498 


133 


hypothetical protein 


SAG1499 


299 


GTP-binding protein Era 


SAG 1500 


132 


diacylglycerol kinase 


SAG 1501 


161 


conserved hypothetical protein TIGR00043 


SAG1502 


268 


tetracenomycin polyketide synthesis O-methyltransferase TcmP, 
putative 


SAG 1503 


39 


hypothetical protein 


SAG1504 


38 


hypothetical protein 


SAG1505 


158 


MutT/nudix family protein 


SAG 1506 


267 


hypothetical protein 


SAG1507 


345 


PhoH family protein 


SAG1508 


590 


67 kDa Myosin-crossreactive streptococcal antigen 


SAG1509 


71 


conserved hypothetical protein 


SAG1510 


169 


peptide methionine sulfoxide reductase 
SAG1511


284 


conserved hypothetical protein 


SAG1512 


185 


ribosome recycling factor 


SAG1513 


242 


uridylate kinase 


SAG1514 


226 


peptide ABC transporter, ATP-binding protein 


SAG1515 


262 


peptide ABC transporter, ATP-binding protein 


SAG1516 


255 


peptide ABC transporter, permease protein 


SAG1517 


314 


peptide ABC transporter, permease protein 


SAG1518 


538 


peptide ABC transporter, peptide-binding protein 


SAG1519 


229 


ribosomal protein L1


SAG 1520 


141 


ribosomal protein L11


SAG1521 


388 


transposase, IS30 family, putative 


SAG1522 


460 


transporter, major facilitator family 


SAG 1523 


404 


peptidase, M20/M25/M40 family 


SAG 1524 


294 


transcriptional regulator, LysR family 


SAG1525 


117 


conserved hypothetical protein 


SAG 1526 


178 


IS861, transposase OrfA 


SAG1527 


277 


IS861, transposase OrfB 


SAG1528 


571 


chorismate binding enzyme 


SAG 1529 


816 


FtsK/SpoIIIE family protein 


SAG 1530 


267 


peptidyl-prolyl cis-trans isomerase, cyclophilin-type 


SAG 1531 


277 


manganese ABC transporter, permease protein 


SAG1532 


238 


manganese ABC transporter, ATP-binding protein 


SAG1533 


308 


manganese ABC transporter, manganese-binding adhesion 
lipoprotein


SAG 1534 


215 


iron-dependent transcriptional regulator 


SAG1535 


229 


5-methylthioadenosine nucleosidase/S-adenosylhomocysteine 
nucleosidase 


SAG1536 


89 


conserved hypothetical protein 


SAG1537


184 


MutT/nudix family protein 


SAG1538


459 


UDP-N-acetylglucosamine pyrophosphorylase 


SAG1539


31 


hypothetical protein 


SAG 1540 


137 


conserved hypothetical protein 


SAG 1541 


125 


glyoxalase family protein 


SAG 1542 


318 


oxidoreductase, Gfo/Idh/MocA family 


SAG 1543 


NA 


conserved hypothetical protein, authentic frameshift 


SAG 1544 


232 


gluconate 5-dehydrogenase, putative 


SAG 1545 


78 


conserved hypothetical protein 


SAG 1546 


82 


conserved hypothetical protein 


SAG 1547 


166 


acetyltransferase, GNAT family 


SAG 1548 


422 


glycosyl transferase, group 2 family protein 


SAG 1549 


127 


IS1381, transposase OrfA


SAG 1550 


129 


IS1381, transposase OrfB


SAG 1551 


67 


hypothetical protein 


SAG1552 


719 


conserved hypothetical protein 


SAG 1553 


477 


hypothetical protein 


SAG 1554 


225 


hypothetical protein 


SAG 1555 


231 


hypothetical protein 


SAG 1556 


445 


branched-chain amino acid transport system II carrier protein 
SAG 1557


665 


methionyl-tRNA synthetase 


SAG 1558 


291 


tellurite resistance protein TehB 


SAG1559 


231 


membrane protein, putative 


SAG 1560 


40 


hypothetical protein 


SAG 1561 


405 


PTS system, IIC component, putative 


SAG 1562 


280 


conserved hypothetical protein 


SAG 1563 


275 


exodeoxyribonuclease 


SAG1564 


118 


conserved hypothetical protein 


SAG1565


158 


methylated-DNA--protein-cysteine S-methyltransferase


SAG1566 


393 


D-isomer specific 2-hydroxyacid dehydrogenase family protein 


SAG 1567 


182 


acetyltransferase, GNAT family 


SAG1568 


NA 


phosphoserine aminotransferase, authentic frameshift 


SAG 1569 


211 


copper homeostasis protein CutC, putative 


SAG 1570 


34 


conserved hypothetical protein 


SAG 1571 


53 


hypothetical protein 


SAG 1572 


287 


tetrapyrrole methylase family protein 


SAG 1573 


108 


conserved hypothetical protein 


SAG 1574 


287 


DNA polymerase III, delta prime subunit, putative 


SAG1575 


211 


thymidylate kinase 


SAG 1576 


267 


transposase, IS30 family, putative, truncation 


SAG 1577 


219 


AcuB family protein 


SAG1578 


236 


branched-chain amino acid ABC transporter, ATP-binding protein 


SAG1579 


254 


branched-chain amino acid ABC transporter, ATP-binding protein


SAG1580


317 


branched-chain amino acid ABC transporter, permease protein 


SAG1581 


289 


branched-chain amino acid ABC transporter, permease protein 


SAG1582 


388 


branched-chain amino acid ABC transporter, amino acid-binding 
protein 


SAG1583 


81 


conserved hypothetical protein 


SAG 1584 


377 


IS1548, transposase


SAG1585 


196 


ATP-dependent Clp protease, proteolytic subunit ClpP 


SAG1586


209 


uracil phosphoribosyltransferase 


SAG1587


389 


aminotransferase, class I 


SAG1588 


182 


RNA methyltransferase, TrmH family, group 2 


SAG1589 


450 


amino acid permease, putative 


SAG 1590 


449 


potassium uptake protein, Trk family 


SAG1591 


475 


cation uptake protein, Trk family 


SAG 1592 


83 


conserved hypothetical protein TIGR00278 


SAG1593 


240 


ribosomal large subunit pseudouridine synthase B 


SAG1594 


194 


conserved hypothetical protein TIGR00281


SAG1595 


235 


conserved hypothetical protein 


SAG 1596 


246 


integrase/recombinase, phage integrase family 


SAG 1597 


157 


CBS domain protein 


SAG 1598 


173 


conserved hypothetical protein 


SAG 1599 


324 


HAM1 protein 


SAG 1600 


264 


glutamate racemase 


SAG 1601 


79 


conserved hypothetical protein 


SAG 1602 


180 


membrane protein, putative 


SAG 1603 


173 


transcriptional regulator, biotin repressor family 
SAG 1604


229 


membrane protein, putative 


SAG1605


167 


conserved hypothetical protein 


SAG 1606 


247 


RNA methyltransferase, TrmH family 


SAG 1607 


92 


acylphosphatase 


SAG 1608 


310 


lipoprotein, putative 


SAG 1609 


221 


amino acid ABC transporter, permease protein 


SAG1610 


285 


amino acid ABC transporter, substrate-binding protein 


SAG1611 


486 


amidase family protein 


SAG1612 


160 


transcription elongation factor GreA 


SAG1613 


600 


conserved hypothetical protein 


SAG1614 


167 


acetyltransferase, GNAT family 


SAG1615 


443 


UDP-N-acetylmuramate--alanine ligase


SAG1616 


205 


conserved hypothetical protein 


SAG1617 


32 


hypothetical protein 


SAG1618 


1032 


Snf2 family protein 


SAG1619 


377 


IS1548, transposase


SAG 1620 


436


phosphoglycerate dehydrogenase-related protein 


SAG1621 


300 


primosomal protein DnaI


SAG 1622 


391 


conserved hypothetical protein 


SAG 1623 


159 


conserved hypothetical protein TIGR00244 


SAG 1624 


501 


sensor histidine kinase CsrS


SAG 1625 


229 


DNA-binding response regulator CsrR


SAG 1626 


111


conserved hypothetical protein 


SAG 1627 


296 


heat shock protein HtpX 


SAG 1628 


184 


lemA protein 


SAG 1629 


237 


glucose-inhibited division protein B 


SAG 1630 


459 


sodium transport family protein 


SAG1631 


223 


potassium uptake protein, Trk family, putative 


SAG 1632 


276 


cobalt transport family protein 


SAG 1633 


558 


ABC transporter, ATP-binding protein 


SAG 1634 


212 


conserved hypothetical protein 


SAG1635 


402 


sodium:dicarboxylate symporter family protein 


SAG1636 


455 


branched-chain amino acid transport system II carrier protein


SAG1637


351 


alcohol dehydrogenase, zinc-containing


SAG1638


230 


ABC transporter, permease protein 


SAG1639 


356 


ABC transporter, ATP-binding protein 


SAG 1640 


458 


peptidase, M20/M25/M40 family 


SAG 1641 


274 


YaeC family protein 


SAG 1642 


211 


ABC transporter, substrate-binding protein


SAG 1643 


229 


glutamine amidotransferase, class I




^ / 


hypothetical protein


SAG 1645 


238 


conserved hypothetical protein TIGR01033 


SAG 1646 


32 


hypothetical protein 


SAG 1647 


328 


dihydroxyacetone kinase family protein 


SAG 1648 


178 


transcriptional regulator, TetR family, putative 


SAG 1649 


37 


hypothetical protein 


SAG 1650 


329 


dihydroxyacetone kinase family protein 


SAG 1651 


192 


dihydroxyacetone kinase family protein 
SAG 1652


124 


conserved hypothetical protein 


SAG 1653 


237 


glycerol uptake facilitator protein 


SAG 1654 


134 


conserved hypothetical protein


SAG1655


237 


transcriptional regulator, MerR family


SAG 1656 


369 


conserved hypothetical protein 


SAG1657 


83 


hypothetical protein 


SAG1658




conserved hypothetical protein 


SAG1659 


118 


lojap-related protein 


SAG 1660 


173 


isochorismatase family protein 


SAG1661 


195 


conserved hypothetical protein TIGR00488 


SAG 1662 


210 


conserved hypothetical protein TIGR00482 


SAG 1663 


105 


conserved hypothetical protein TIGR00253


SAG 1664 


372 


GTP-binding protein


SAG 1665 


111 


hydrolase, haloacid dehalogenase-like family 


SAG 1666 


304 


membrane protein, putative


SAG 1667 


480 


glutamyl-tRNA(Gln) amidotransferase, B subunit


SAG 1668 


488 


glutamyl-tRNA(Gln) amidotransferase, A subunit


SAG 1669 


100 


glutamyl-tRNA(Gln) amidotransferase, C subunit


SAG 1670 


881 


pyruvate phosphate dikinase


SAG1671 


276


protein of unknown function


SAG 1672 


170 


CBS domain protein 


SAG1673


321 


3-hydroxyacyl-CoA dehydrogenase family protein


SAG 1674 


182 


isochorismatase family protein


SAG1675 


261 


transcriptional regulator CodY, putative


SAG 1676 


403 


aminotransferase, class I


SAG 1677 


150 


conserved hypothetical protein


SAG 1678 


460 


hydrolase, haloacid dehalogenase-like family


SAG 1679 


320 


asparaginase family protein 


SAG 1680 


292 


shikimate 5-dehydrogenase


SAG1681 


304 


oxidoreductase, aldo/keto reductase family


SAG1682 


671 


ATP-dependent DNA helicase RecG


SAG1683 


512 


immunogenic secreted protein, putative 


SAG1684


366


alanine racemase


SAG 1685 


119 


holo-(acyl-carrier-protein) synthase


SAG 1686 


335


phospho-2-dehydro-3-deoxyheptonate aldolase


SAG 1687 


842 


preprotein translocase, SecA subunit 


SAG1688


315 


mannose-6-phosphate isomerase, class I 


SAG 1689 


293 


fructokinase


SAG1690


639 


PTS system, IIABC components 


SAG1691


479


sucrose-6-phosphate hydrolase


SAG1692




sucrose operon repressor ScrR


SAG 1693 


144 


N utilization substance protein B 


SAG 1694 


129 


conserved hypothetical protein 


SAG 1695 


186 


translation elongation factor P 


SAG 1696 


38 


hypothetical protein 


SAG 1697 


48 


hypothetical protein 


SAG 1698 


99 


conserved hypothetical protein 


SAG 1699 


30 


hypothetical protein


SAG1700


76 


hypothetical protein 


SAG1701


56 


hypothetical protein 


SAG1702


41 


hypothetical protein 


SAG1703


54 


hypothetical protein 


SAG1704


150 


cytidine/deoxycytidylate deaminase family protein


SAG1705


NA


peptidase, M24 family, authentic point mutation 


SAG 1706 


238 


conserved hypothetical protein 


SAG 1707 


499 


drug resistance transporter, EmrB/QacA family 


SAG1708


38


hypothetical protein


SAG 1709 


942 


excinuclease ABC, A subunit


SAG1710 


223 


conserved hypothetical protein 


SAG1711 


314 


magnesium transporter, CorA family 


SAG1712


79


ribosomal protein S18


SAG1713 


163 


single-strand binding protein


SAG1714


95 


ribosomal protein S6


SAG1715


374


A/G-specific adenine glycosylase


SAG1716 


197 


transcriptional regulator, Cro/CI family


SAG1717


104


thioredoxin


SAG1718


166 


PAP2 family protein 


SAG1719


779 


MutS2 family protein 


SAG 1720 


180


conserved hypothetical protein 


SAG1721


103


conserved hypothetical protein 


SAG 1722 


297 


ribonuclease HIII


SAG1723


197 


signal peptidase I 


SAG1724


806


helicase, putative


SAG1725


160 


conserved hypothetical protein


SAG 1726 


364 


DNA-damage-inducible protein P


SAG1727


770 


formate acetyltransferase 


SAG1728


124 


FMN-binding protein


SAG 1729 


309 


conserved hypothetical protein 


SAG1730 


251 


conserved hypothetical protein


SAG1731


298 


membrane protein, putative 


SAG1732


2o2 


glycerol uptake facilitator protein, putative 


SAG1733


IdO 


universal stress protein family 


SAG1734


400 


transporter, putative 


SAG1735


ziy


transcriptional regulator, Crp/Fnr family


SAG1736


761 


X-pro dipeptidyl-peptidase 


SAG1737


119


hypothetical protein 


SAG1738




polyprenyl synthetase family protein


SAG1739


582


ABC transporter, ATP-binding protein CydC


SAG1740


57?


ABC transporter, ATP-binding protein CydD


SAG 1741 


339 


cytochrome d ubiquinol oxidase, subunit II 


SAG 1742 


475 


cytochrome d oxidase, subunit I 


SAG 1743 


402 


pyridine nucleotide-disulphide oxidoreductase family protein 


SAG 1744 


299 


prenyltransferase, UbiA family 


SAG1745 


148 


hypothetical protein 


SAG 1746 


35 


hypothetical protein 


SAG 1747 


99 


conserved hypothetical protein TIGR00103


SAG1748


396 


cyclopropane-fatty-acyl-phospholipid synthase 


SAG1749 


241 


transcriptional regulator, MerR family


SAG1750 


195 


exonuclease 


SAG1751 


178 


conserved hypothetical protein 


SAG1752 


390 


conserved hypothetical protein TIGR00275 


SAG1753 


260 


conserved hypothetical protein 


SAG 1754 


89 


ribosomal protein S14 


SAG1755 


38 


hypothetical protein 


SAG 1756 


341 


conserved hypothetical protein 


SAG 1757 


336 


O-sialoglycoprotein endopeptidase family protein 


SAG1758 


135 


ribosomal-protein-alanine acetyltransferase, putative 


SAG 1759 


230 


protein of unknown function 


SAG 1760 


76 


conserved hypothetical protein 


SAG 1761 


559 


metallo-beta-lactamase superfamily protein 


SAG 1762 


169 


conserved hypothetical protein 


SAG1763 


448 


glutamine synthetase, type I 


SAG1764 


123 


transcriptional regulator GlnR 


SAG1765 


179 


conserved hypothetical protein 


SAG 1766 


398 


phosphoglycerate kinase 


SAG 1767 


289 


acid phosphatase 


SAG1768 


336 


glyceraldehyde 3-phosphate dehydrogenase 


SAG 1769 


692 


translation elongation factor G 


SAG1770 


156 


ribosomal protein S7 


SAG 1771 


137 


ribosomal protein S12 


SAG1772 


270 


pur operon repressor 


SAG 1773 


313 


HD domain protein 


SAG 1774 


424 


conserved hypothetical protein 


SAG1775 


210 


conserved hypothetical protein 


SAG 1776 


220 


ribulose-phosphate 3-epimerase 


SAG1777 


290 


conserved hypothetical protein TIGR00157


SAG 1778 


283 


rRNA (guanine-N1-)-methyltransferase, putative


SAG1779 


290 


dimethyladenosine transferase 


SAG1780 


163 


hypothetical protein 


SAG1781 


186 


primase-related protein 


SAG1782 


260 


deoxyribonuclease, TatD family 


SAG 1783 


90 


hypothetical protein 


SAG1784 


130 


hypothetical protein 


SAG 1785 


430 


hypothetical protein 


SAG1786 


130 


protein of unknown function 


SAG1787 


420 


dltD protein 


SAG1788 


79 


D-alanyl carrier protein 


SAG1789 


421 


dltB protein 


SAG 1790 


511 


D-alanine-activating enzyme 


SAG 1791 


395 


sensor histidine kinase 


SAG1792 


224 


DNA-binding response regulator 


SAG 1793 


44 


ribosomal protein L34 


SAG 1794 


451 


membrane protein, putative 


SAG1795 


388 


transposase, IS30 family, putative


SAG1796


575 


amino acid ABC transporter, permease protein


SAG 1797 


407 


amino acid ABC transporter, ATP-binding protein


SAG1798 


39 


hypothetical protein 


SAG1799 


792 


xylulose-5-phosphate/fructose-6-phosphate phosphoketolase


SAG1800


363


conserved hypothetical protein


SAG1801 


559 


transcriptional antiterminator, BglG family


SAG 1802 


253 


conserved hypothetical protein 


SAG 1803 


505 


carbohydrate kinase, FGGY family 


SAG1804


329


hypothetical protein


SAG1805


483 


PTS system, IIC component, putative 


SAG1806


318 


glyoxylate reductase, NADH-dependent 


SAG1807


339 


hypothetical protein 


SAG1808


327


sugar binding transcriptional regulator, LacI family


SAG 1809 


215 


transaldolase family protein 


SAG1810 


238 


carbohydrate isomerase, AraD/FucA family


SAG1811


287


hexulose-6-phosphate isomerase, putative


SAG1812 


221 


hexulose-6-phosphate synthase, putative


SAG1813 


161 


PTS system, IIA component 


SAG1814 


92 


PTS system, IIB component 


SAG1815 


479 


transport protein SgaT, putative 


SAG1816


205


hypothetical protein


SAG1817 


157 


hypothetical protein


SAG1818 


430 


adenylosuccinate synthetase


SAG1819 


340 


perfringolysin O regulator protein


SAG1820


224


conserved hypothetical protein


SAG1821


750


glutamate--cysteine ligase/amino acid ligase, putative


SAG1822


272 


protein of unknown function 


SAG 1823 


418 


protein of unknown function 


SAG1824


291 


chaperonin, 33 kDa 


SAG 1825 


325 


NifR3/Smm1 family protein


SAG 1826 


213 


deoxynucleoside kinase family protein


SAG1827


163


phosphinothricin N-acetyltransferase


SAG1828


815


ATP-dependent Clp protease, ATP-binding subunit


SAG1829


154


transcriptional regulator CtsR


SAG1830


153 


conserved hypothetical protein 


SAG1831


346 


translation elongation factor Ts 


SAG1832


256


ribosomal protein S2


SAG1833


186


alkyl hydroperoxide reductase, subunit C


SAG1834


510


alkyl hydroperoxide reductase, subunit F


SAG1835


134


conserved hypothetical protein


SAG1836


61 


conserved hypothetical protein 


SAG1837


468


prophage LambdaSa2, lysin, putative


SAG1838


109


prophage LambdaSa2, holin, putative


SAG1839


136 


conserved hypothetical protein 


SAG1840


112


hypothetical protein


SAG1841


76


conserved domain protein


SAG1842


1224


prophage LambdaSa2, PblB, putative


SAG1843


240 


conserved hypothetical protein 






WO 2004/018646 



PCT/US2003/026827 



Table 1: Complete list of GBS predicted genes 



ORF 


Size 
(a.a.) 


Annotation 



SAG1844 


911 


conserved hypothetical protein 


SAG1845 


42 


hypothetical protein 


SAG 1846 


158 


hypothetical protein 


SAG 1847 


227 


conserved hypothetical protein 


SAG 1848 


114 


conserved hypothetical protein 


SAG 1849 


115 


hypothetical protein 


SAG1850 


101 


hypothetical protein 


SAG 1851 


111 


conserved domain protein 


SAG1852 


420 


conserved domain protein 


SAG 1853 


180 


prophage LambdaSa2, protease, putative 


SAG 1854 


380 


conserved hypothetical protein 


SAG 1855 


570 


prophage LambdaSa2, terminase large subunit, putative 


SAG 1856 


161 


hypothetical protein 


SAG 1857 


119 


prophage LambdaSa2, HNH endonuclease family protein 


SAG 1858 


95 


hypothetical protein 


SAG1859 


180 


prophage LambdaSa2, site-specific recombinase, phage integrase 
family 


SAG 1860 


154 


conserved hypothetical protein 


SAG 1861 


119 


prophage LambdaSa2, transcriptional regulator, Cro/CI family 


SAG 1862 


86 


hypothetical protein 


SAG 1863 


138 


prophage LambdaSa2, single-strand binding protein 


SAG 1864 


68 


hypothetical protein 


SAG 1865 


74 


conserved hypothetical protein 


SAG1866 


109 


conserved hypothetical protein


SAG1867 


163 


conserved hypothetical protein 


SAG 1868 


134 


hypothetical protein 


SAG1869 


437 


prophage LambdaSa2, type II DNA modification 
methyltransferase, putative 


SAG1870 


273 


prophage LambdaSa2, DNA replication protein DnaC, putative 


SAG1871 


248 


prophage LambdaSa2, bacteriophage replication 
protein/hypothetical protein, truncation/fusion 


SAG1872 


200 


hypothetical protein 


SAG 1873 


443 


prophage LambdaSa2, replicative DNA helicase 


SAG 1874 


87 


hypothetical protein 


SAG1875 


94 


conserved hypothetical protein 


SAG 1876 


176 


prophage LambdaSa2, HNH endonuclease family protein 


SAG1877 


236 


prophage LambdaSa2, antirepressor protein, putative 


SAG1878 


102 


conserved domain protein 


SAG 1879 


156 


hypothetical protein 


SAG 1880 


54 


hypothetical protein 


SAG1881 


51 


hypothetical protein 


SAG1882 


120 


prophage LambdaSa2, repressor protein, putative 


SAG 1883 


128 


conserved hypothetical protein 


SAG1884 


134 


hypothetical protein 


SAG1885 


356 


prophage LambdaSa2, site-specific recombinase, phage integrase 
family 


SAG 1886 


32 


hypothetical protein 


SAG1887 


689 


Na+/H+ exchanger family protein 







SAG1888 


78 


hypothetical protein 


SAG1889 


317 


microcin immunity protein MccF, putative 


SAG 1890 


631 


endopeptidase O 


SAG1891 


327 


oxidoreductase, Gfo/Idh/MocA family


SAG1892 


358 


membrane protein, putative 


SAG1893 


59 


hypothetical protein 


SAG 1894 


214 


cyclic nucleotide-binding domain protein 


SAG1895 


204 


polypeptide deformylase 


SAG1896 


333 


sugar binding transcriptional regulator RegR 


SAG1897 


634 


conserved hypothetical protein 


SAG1898 


271 


PTS system, IID component 


SAG1899 


288 


PTS system, IIC component 


SAG 1900 


164 


PTS system, IIB component 


SAG1901 


398 


glucuronyl hydrolase 


SAG 1902 


144 


PTS system, IIA component 


SAG 1903 


34 


hypothetical protein 


SAG 1904 


270


oxidoreductase, short-chain dehydrogenase/reductase family 


SAG 1905 


212 


conserved hypothetical protein 


SAG 1906 


335 


carbohydrate kinase, PfkB family 


SAG 1907 


212 


2-dehydro-3-deoxyphosphogluconate aldolase/4-hydroxy-2-oxoglutarate aldolase


SAG 1908 


499 


hypothetical protein 


SAG 1909 


204 


nitroreductase family protein 


SAG1910 


141 


transcriptional regulator, MarR family 


SAG1911 


1468 


DNA polymerase III, alpha subunit, Gram-positive type 


SAG1912 


194 


N-acetylmuramoyl-L-alanine amidase, family 4 protein 


SAG1913 


617 


prolyl-tRNA synthetase 


SAG1914 


419 


membrane-associated zinc metalloprotease, putative 


SAG1915 


264 


phosphatidate cytidylyltransferase 


SAG1916 


250 


undecaprenyl diphosphate synthase 


SAG1917 


113 


preprotein translocase, YajC subunit 


SAG1918 


114 


bacteriocin transport accessory protein, putative 


SAG1919 


387 


malate oxidoreductase


SAG1920 


445 


citrate carrier protein, CCS family


SAG1921 


508 


sensor histidine kinase


SAG 1922 


229 


response regulator


SAG 1923 


331 


UDP-glucose 4-epimerase 


SAG 1924 


535 


glucan 1,6-alpha-glucosidase


SAG 1925 


377 


sugar ABC transporter, ATP-binding protein 


SAG 1926 


283 


helix-turn-helix domain protein, fis-type


SAG1927


Zyo 


lacA protein


SAG1928 


325 


tagatose 1,6-diphosphate aldolase


SAG1929 


310 


tagatose-6-phosphate kinase


SAG1930 


171 


galactose-6-phosphate isomerase, LacB subunit


SAG1931 


141 


galactose-6-phosphate isomerase, LacA subunit


SAG1932 


816 


neuraminidase-related protein


SAG1933 


482 


PTS system, IIC component, putative


SAG1934 


101 


PTS system, IIB component, putative


SAG1935


157 


PTS system, IIA component, putative 


SAG1936 


258 


lactose phosphotransferase system repressor 


SAG 1937 


NA 


streptococcal histidine triad family protein, degenerate 


SAG1938 


307 


adhesion lipoprotein 


SAG1939 


147 


protein of unknown function TIGR00256 


SAG1940 


738 


GTP pyrophosphokinase family protein 


SAG 1941 


800 


2',3'-cyclic-nucleotide 2'-phosphodiesterase


SAG 1942 


151 


nrdI protein


SAG1943 


345 


conserved hypothetical protein 


SAG 1944 


165 


conserved hypothetical protein 


SAG1945 


345 


iron ABC transporter, iron-binding protein 


SAG 1946 


257 


DNA-binding response regulator 


SAG1947 


549 


conserved hypothetical protein 


SAG1948 


275 


PTS system, IID component


SAG 1949 


269 


PTS system, IIC component 


SAG 1950 


163 


PTS system, IIB component 


SAG 1951 


141 


PTS system, IIA component, putative 


SAG1952 


353 


membrane protein, putative 


SAG1953 


60 


hypothetical protein 


SAG1954 


384 


membrane protein, putative 


SAG1955 


282 


ABC transporter, ATP-binding protein 


SAG1956 


96 


conserved hypothetical protein, truncation 


SAG1957 


250 


response regulator 


SAG1958 


276 


conserved hypothetical protein 


SAG 1959 


727 


PTS system, IIABC components 


SAG 1960 


551 


sensor histidine kinase 


SAG 1961 


225 


phosphate regulon response regulator PhoB 


SAG 1962 


218 


phosphate transport system regulatory protein PhoU, putative 


SAG 1963 


253 


phosphate ABC transporter, ATP-binding protein 


SAG 1964 


292 


phosphate ABC transporter, permease protein 


SAG 1965 


281 


phosphate ABC transporter, permease protein 


SAG 1966 


293 


hemolysin precursor, putative 


SAG 1967 


195 


hypothetical protein 


SAG1968 


246 


conserved hypothetical protein TIGR00046 


SAG 1969 


317 


ribosomal protein L11 methyltransferase


SAG 1970 


102 


conserved hypothetical protein 


SAG 1971 


41 


hypothetical protein 


SAG 1972 


238 


transcriptional regulator, MerR family 


SAG 1973 


156 


acetyltransferase, GNAT family 


SAG 1974 


152 


MutT/nudix family protein 


SAG 1975 


47 


hypothetical protein 


SAG 1976 


156 


conserved hypothetical protein 


SAG 1977 


163 


acetyltransferase, GNAT family 


SAG 1978 


422 


ATPase, AAA family 


SAG 1979 


253 


membrane protein, putative 


SAG 1980 


300 


ABC transporter, ATP-binding protein 


SAG 1981 


68 


hypothetical protein 


SAG1982


359


transcriptional regulator, Cro/CI family


SAG1983


105 


conserved hypothetical protein 


SAG1984 


188 


conserved hypothetical protein TIGR00730 


SAG1985 


51 


hypothetical protein 


SAG1986 


375 


site-specific recombinase, phage integrase family 


SAG 1987 


61 


conserved hypothetical protein 


SAG 1988 


342 


conserved hypothetical protein 


SAG1989 


139 


hypothetical protein 


SAG 1990 


127 


hypothetical protein 


SAG 1991 


204 


transcriptional regulator, Cro/CI family 


SAG 1992 


518 


protein of unknown function 


SAG 1993 


373 


site-specific recombinase, phage integrase family 


SAG 1994 


108 


conserved hypothetical protein 


SAG 1995 


210 


hypothetical protein 


SAG1996 


263 


cell wall surface anchor family protein, putative 


SAG 1997 


182 


hypothetical protein 


SAG1998 


457 


hypothetical protein 


SAG 1999 


47 


hypothetical protein 


SAG2000 


666 


membrane protein, putative 


SAG2001 


756 


conjugal transfer protein, interruption-C 


SAG2002 


129 


IS1381, transposase OrfB


SAG2003 


127 


IS1381, transposase OrfA


SAG2004 


67 


conjugal transfer protein, interruption-N 


SAG2005 


136 


conserved hypothetical protein 


SAG2006 


88 


conserved hypothetical protein 


SAG2007 


317 


conserved hypothetical protein 


SAG2008 


84 


conserved hypothetical protein 


SAG2009 


88 


conserved hypothetical protein 


SAG2010 


157 


hypothetical protein 


SAG2011 


160 


conserved hypothetical protein 


SAG2012 


90 


hypothetical protein 


SAG2013 


189 


hypothetical protein 


SAG2014 


449 


hypothetical protein 


SAG2015 


99 


transcriptional regulator, Cro/CI family 


SAG2016 


125 


hypothetical protein 


SAG2017 


429 


transcriptional regulator, Cro/CI family 


SAG2018 


553 


FtsK/SpoIIIE family protein 


SAG2019 


153 


hypothetical protein 


SAG2020 


98 


hypothetical protein 


SAG2021 


826 


cell wall surface anchor family protein 


SAG2022 


417 


transposase, ISL3 family 


SAG2023 


546 


mercuric reductase 


SAG2024 


130 


mercuric resistance operon regulatory protein MerR 


SAG2025 


522 


Mn2+/Fe2+ transporter, NRAMP family 


SAG2026 


240 


membrane protein, putative 


SAG2027 


205 


ABC transporter, ATP-binding protein 


SAG2028 


36 


conserved hypothetical protein 


SAG2029 


284 


streptomycin resistance protein 


SAG2030 


130 


hypothetical protein


SAG2031


202 


hypothetical protein 


SAG2032 


111 


conserved hypothetical protein 


SAG2033 


162 


acetyltransferase, GNAT family 


SAG2034 


247 


membrane protein, putative 


SAG2035 


300 


ABC transporter, ATP-binding protein 


SAG2036 


68 


hypothetical protein 


SAG2037 


358 


transcriptional regulator, Cro/CI family 


SAG2038 


204 


PAP2 family protein 


SAG2039 


98 


conserved hypothetical protein 


SAG2040 


186 


conserved hypothetical protein TIGR00730 


SAG2041 


287 


protease, putative 


SAG2042 


100 


rhodanese family protein 


SAG2043 


255 


cAMP factor


SAG2044 


62 


hypothetical protein 


SAG2045 


179 


DNA topology modulation protein FlaR, putative 


SAG2046 


361 


glycerol dehydrogenase, putative 


SAG2047 


235 


conserved hypothetical protein 


SAG2048 


614 


5-methyltetrahydrofolate--homocysteine methyltransferase, putative


SAG2049 


745 


5-methyltetrahydropteroyltriglutamate-homocysteine 
methyltransferase 


SAG2050 


107 


conserved hypothetical protein 


SAG2051 


230 


branched-chain amino acid transport protein AzlC, putative 


SAG2052 


41 


hypothetical protein 


SAG2053 


1570 


serine protease, subtilase family, putative 


SAG2054 


228 


DNA-binding response regulator 


SAG2055 


462 


sensor histidine kinase 


SAG2056 


202 


chromosome assembly-related protein 


SAG2057 


833 


leucyl-tRNA synthetase 


SAG2058 


415 


major facilitator family protein 


SAG2059 


281 


protein of unknown function 


SAG2060 


398 


glycosyl transferase, family 8 


SAG2061 


401 


glycosyl transferase, family 8 


SAG2062 


179 


transcription antitermination protein NusG 


SAG2063 


630 


pathogenicity protein, putative 


SAG2064 


57 


preprotein translocase, SecE subunit, putative 


SAG2065 


50 


ribosomal protein L33 


SAG2066 


773 


penicillin-binding protein 2A 


SAG2067 


294 


ribosomal large subunit pseudouridine synthase, RluD subfamily 


SAG2068 


546 


conserved hypothetical protein


SAG2069 


403 


phosphopentomutase 


SAG2070 


223 


deoxyribose-phosphate aldolase 


SAG2071 


400 


Na+ dependent nucleoside transporter 


SAG2072 


259 


uridine phosphorylase 


SAG2073 


245 


transcriptional regulator, GntR family


SAG2074 


540 


60 kDa chaperonin


SAG2075 


94 


chaperonin, 10 kDa


SAG2076 


267 


ABC transporter, ATP-binding protein


SAG2077


298 


ABC transporter, permease protein


SAG2078 


320 


protein of unknown function/lipoprotein, putative


SAG2079 


265 


hydrolase, haloacid dehalogenase-like family 


SAG2080 


286 


glyoxalase family protein 


SAG2081 


243 


conserved hypothetical protein 


SAG2082 


205


anaerobic ribonucleoside-triphosphate reductase activating protein


SAG2083 


163


acetyltransferase, GNAT family 


SAG2084 


310 


virulence factor MviM, putative 


SAG2085 


47


conserved hypothetical protein 


SAG2086 


723 


anaerobic ribonucleoside-triphosphate reductase 


SAG2087 


495 


membrane protein, putative 


SAG2088 


40 


hypothetical protein 


SAG2089 


105 


conserved hypothetical protein 


SAG2090


136 


conserved hypothetical protein TIGR00250 


SAG2091 


88 


conserved hypothetical protein 


SAG2092 


132 


conserved hypothetical protein 


SAG2093 


379 


recA protein


SAG2094 


NA


competence/damage-inducible protein CinA, authentic frameshift 


SAG2095 


183 


DNA-3-methyladenine glycosylase I


SAG2096 


196 


Holliday junction DNA helicase RuvA


SAG2097 


418 


transporter, putative 


SAG2098 


659 


DNA mismatch repair protein HexB 


SAG2099 


33 


hypothetical protein


SAG2100 


67 


cold shock protein, CSD family 


SAG2101 


858 


DNA mismatch repair protein HexA 


SAG2102 


145 


arginine repressor ArgR, putative 


SAG2103


563 


arginyl-tRNA synthetase 


SAG2104 


102 


conserved hypothetical protein 


SAG2105 


290 


conserved hypothetical protein 


SAG2106 


314


conserved hypothetical protein 


SAG2107


583


aspartyl-tRNA synthetase


SAG2108 


426 


histidyl-tRNA synthetase 


SAG2109


60


ribosomal protein L32


SAG2110


49


ribosomal protein L33


SAG2111


173


conserved hypothetical protein 


SAG2112 


494


site-specific recombinase, phage integrase family


SAG2113


82


conserved hypothetical protein


SAG2114


342 


conserved hypothetical protein 


SAG2115


143


hypothetical protein


SAG2116


151


conserved hypothetical protein


SAG2117


/ 1


hypothetical protein


SAG2118 


306 


transcriptional regulator, Cro/CI family 


SAG2119 


373 


conserved domain protein 


SAG2120 


269 


hypothetical protein 


SAG2121 


223 


hypothetical protein 


SAG2122 


223 


DNA-binding response regulator 


SAG2123 


454 


sensor histidine kinase 


SAG2124


517 


membrane protein, putative


SAG2125


308 


carbamate kinase 


SAG2126 


332 


ornithine carbamoyltransferase 


SAG2127 


431 


sensor histidine kinase 


SAG2128 


277 


response regulator 


SAG2129 


240 


amino acid ABC transporter, ATP-binding protein 


SAG2130 


504 


amino acid ABC transporter, amino acid-binding protein/permease 
protein 


SAG2131 


847 


membrane protein, putative 


SAG2132 


247 


conserved hypothetical protein 


SAG2133 


118 


conserved hypothetical protein 


SAG2134 


772 


membrane protein, putative 


SAG2135 


179 


transcriptional regulator, TetR family, putative 


SAG2136 


98 


conserved hypothetical protein 


SAG2137 


203 


ribosomal protein S4 


SAG2138 


95 


conserved hypothetical protein 


SAG2139 


451 


replicative DNA helicase 


SAG2140 


150 


ribosomal protein L9 


SAG2141 


660 


DHH family protein 


SAG2142 


613 


glucose inhibited division protein A 


SAG2143 


203 


membrane protein, putative


SAG2144 


373 


tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase


SAG2145 


222 


L-serine dehydratase, iron-sulfur-dependent, beta subunit


SAG2146 


290 


L-serine dehydratase, iron-sulfur-dependent, alpha subunit


SAG2147 


234 


protein of unknown function/lipoprotein, putative


SAG2148 


179 


LysM domain protein


SAG2149 


264 


cobalt transport family protein


SAG2150 


280 


ABC transporter, ATP-binding protein


SAG2151 


279 


ABC transporter, ATP-binding protein


SAG2152 


180 


CDP-diacylglycerol-glycerol-3-phosphate 3-phosphatidyltransferase


SAG2153 


427 


peptidase, M16 family


SAG2154 


414 


conserved hypothetical protein


SAG2155 


117 


conserved hypothetical protein


SAG2156 


369 


recF protein


SAG2157 


278 


transporter, putative


SAG2158 


220 


transcriptional regulator, Cro/CI family


SAG2159 


493 


inosine-5-monophosphate dehydrogenase


SAG2160 


161 


transcriptional regulator, ArgR family


SAG2161 


226 


transcriptional regulator, Crp/Fnr family


SAG2162 


234 


conserved hypothetical protein


SAG2163


410 


arginine deiminase


SAG2164 


136 


acetyltransferase, GNAT family


SAG2165 


337 


ornithine carbamoyltransferase


SAG2166 


475 


arginine/ornithine antiporter


SAG2167 


318 


carbamate kinase


SAG2168 


341 


tryptophanyl-tRNA synthetase 


SAG2169 


230 


membrane protein, putative 


SAG2170 


290 


conserved hypothetical protein


SAG2171 


539 


ABC transporter, ATP-binding protein 


SAG2172 


859 


ABC transporter, permease protein, putative 


SAG2173 


159 


conserved hypothetical protein TIGR00246 


SAG2174 


409 


serine protease 


SAG2175 


257 


partitioning protein, ParB family


ORF


Size
(aa)


Peptide


Sortase
motif


Lipoprotein


Western
blot


FACS


GBS
specific


Annotation


SAG0017 


447 


+ 














pcsB 


SAG0031 


299 


+ 














peptidase, M23/M37 family 


SAG0032 


434 


+ 








+ 


+




group B streptococcal surface immunogenic protein 


SAG0034 


438 


+ 








+ 


+ 




sugar ABC transporter, sugar-binding protein 


SAG0051 


126 


+ 








+ 


+ 




MORN motif family protein 


SAG0079 


212 








+ 


+ 


+




adenylate kinase 


SAG0086


85 






+ 








+


lipoprotein, putative 


SAG0093 


250 


+ 








+ 


+ 




D-alanyl-D-alanine carboxypeptidase family protein 


SAG0094 


191 


+ 














N-acetylmuramoyl-L-alanine amidase, family 4 protein 


SAG0108 


308 


+ 














conserved hypothetical protein 


SAG0114 


322 


+ 




+ 










ribose ABC transporter, periplasmic D-ribose-binding 
protein 


SAG0124 


356 


+ 














sensor histidine kinase 


SAG0132 


294 


+ 








+ 


+




SPFH domain/Band 7 family protein 


SAG0134


96 


+ 












+ 


hypothetical protein 


SAG0146 


395 


+ 














penicillin-binding protein 4, putative 


SAG0147 


411 
















D-alanyl-D-alanine carboxypeptidase family protein 


SAG0148 


551 






+ 




+ 


- 




oligopeptide ABC transporter, substrate-binding protein, 
putative 


SAG0166 


123 


+ 














conserved domain protein 


SAG0176


94 
















conserved hypothetical protein 


SAG0187 


542 


+ 




+ 




+ 


+




oligopeptide ABC transporter, oligopeptide-binding 
protein 


SAG0206 


60 






+ 








+ 


lipoprotein, putative 


SAG0213 


39 


+ 














hypothetical protein 


SAG0231 


135 


+ 














hypothetical protein 


SAG0242 


308 






+ 




+


- 




amino acid ABC transporter, amino acid-binding protein 


SAG0245 


152 






+ 




+




+ 


protein of unknown function/lipoprotein, putative 


SAG0255 


315 


+ 














conserved hypothetical protein 


SAG0257 


53 






+ 








+ 


lipoprotein, putative 


SAG0265 


235 


+ 








+




+


conserved hypothetical protein 


SAG0290 


270 










+ 






ABC transporter, substrate-binding protein 


SAG0298 


750 


+














penicillin-binding protein 1A 
SAG0306


535 


+ 














KH domain protein 


SAG0321 


339 


+ 














sensor histidine kinase, putative 


SAG0329 


106 


+ 














PTS system, cellobiose-specific IIB component 


SAG0368 


435 
















protein of unknown function 


SAG0371 


167 














+ 


hypothetical protein 


SAG0383 


334 


+ 








+ 


- 




protein of unknown function/lipoprotein, putative


SAG0392 


521 


+ 


+ 






+ 


+ 




cell wall surface anchor family protein 


SAG0394 


345 
















sensor histidine kinase 


SAG0405 


347 






+ 




+ 


+ 




protein of unknown function/lipoprotein, putative


SAG0406 


299 


+ 














UTP-glucose-1-phosphate uridylyltransferase


SAG0407 


338 


+ 














glycerol-3-phosphate dehydrogenase (NAD(P)+)


SAG0416 


1233 




+ 








+ 




protease, putative 


SAG0421 


1055 




+ 


— — 


— — — 


+ 


- 




cell wall surface anchor family protein 


SAG0433 


1389 




+ 












surface protein Rib 


SAG0437 


123 






+ 










lipoprotein, putative 


SAG0451 


149 


+ 




+ 








+ 


bacteriocin transport accessory protein, putative


SAG0455 


357 
















conserved hypothetical protein 


SAG0472 


126 










+ 


- 




rhodanese-like family protein 


SAG0482 


84 
















YGGT family protein 


SAG0499 


275 








+ 








hemolysin A 


SAG0503 


279 










+ 






lipase/acylhydrolase 


SAG0504 


200 


+ 














conserved hypothetical protein 


SAG0506 


65 














+ 


hypothetical protein 


SAG0521 


236 


+ 














carboxymethylenebutenolidase-related protein 


SAG0535 


506 












+ 




zinc ABC transporter, zinc-binding adhesion lipoprotein


SAG0596 


670 








+ 








prophage LambdaSa1, pblA protein, internal deletion


SAG0603 


111 








+ 








conserved hypothetical protein 


SAG0604 


239 








+ 








prophage LambdaSa1, lysin, putative


SAG0617 


439 
















sensor histidine kinase VncS 


SAG0624 


574 


+ 














septation ring formation regulator EzrA, putative 


SAG0629 


354 
















conserved domain protein 


SAG0635 


245 


+ 














acid phosphatase, class B 


SAG0638 


109 
















cell wall surface anchor family protein, interruption-N 
SAG0645


554 




+ 






+ 


+ 




cell wall surface anchor family protein 


SAG0646 


307 




+ 






+






cell wall surface anchor family protein


SAG0647 


305 


+ 














sortase family protein 


SAG0649 


890 




+ 






+


+ 




cell wall surface anchor family protein, putative 


SAG0658 


383 






+ 










lipoprotein, putative 


SAG0675 


171 


+














putative secreted protein 


SAG0676 


885 








+ 








proteinase, putative 


SAG0677 


1062 




+ 












hypothetical protein 


SAG0679 


343 


+




+ 




+ 


• 




protein of unknown function 


SAG0680 


339 


+ 








+


- 




protein of unknown function 


SAG0681 


353 


+ 














conserved domain protein 


SAG0686 


261 


+ 








+ 






DNA-entry nuclease, putative 


SAG0714 


188 


+












+ 


conserved hypothetical protein 


SAG0717 


266 


+ 








+ 


+ 




amino acid ABC transporter, amino acid-binding protein 


SAG0720 


449 








+ 








sensory box histidine kinase 


SAG0738 


132 


+














conserved hypothetical protein 


SAG0739 


143 


+














conserved hypothetical protein 


SAG0742 


428 








+




+




peptidase, U32 family 


SAG0755 


282 


+ 














peptidase, U32 family 


SAG0757 


129 


+




+ 




+


- 




protein of unknown function/lipoprotein, putative 


SAG0764 


230 








+


+


+




phosphoglycerate mutase family protein


SAG0765 


681 


+














penicillin-binding protein 2b 


SAG0771 


512 


+


+ 






+ 


+ 


+ 


cell wall surface anchor family protein 


SAG0776 


276 


+




+ 










YaeC family protein, putative 


SAG0777 


528 








+ 


+ 


+ 




ATP-dependent RNA helicase, DEAD/DEAH box family 


SAG0785 


330 


+














conserved hypothetical protein 


SAG0808 


309 


+




+




+ 


+ 




protease maturation protein, putative 


SAG0824 


417 


+














polysaccharide deacetylase family protein 


SAG0832 


753 


+








+


+




protein of unknown function 


SAG0833 


181 


+












+ 


hypothetical protein 


SAG0867 


63 


+ 














conserved hypothetical protein 


SAG0868 


285 


+








+ 






DNA-entry nuclease 


SAG0886 


319 


+








+


+




protein of unknown function 
Western blot

SAG0904


56 


+ 












+ 


hypothetical protein 


SAG0907 


877 


+ 




+ 




+ 


- 




protein of unknown function/lipoprotein, putative 


SAG0926 


333 
















Tn916, NLP/P60 family protein 


SAG0942 


185 


+ 








+ 


+ 




signal peptidase I, putative 


SAG0949 


276 


+ 




+ 




+


+ 




amino acid ABC transporter, amino acid-binding protein 


SAG0954 


349 










+ 


- 




protein of unknown function/lipoprotein, putative 


SAG0961 


247 


+ 










- 




sortase SrtA 


SAG0963 


320 












• 




conserved hypothetical protein 


SAG0971 


282 






+ 




+ 






protein of unknown function/lipoprotein, putative 


SAG0973 


320 














+ 


nisin-resistance protein, putative 


SAG0977 


312 
















sensor histidine kinase 


SAG0979 


553 


+ 




+ 






- 




ABC transporter, substrate-binding protein 


SAG0984 


437 


+ 














sensor histidine kinase CiaH 


SAG0992 


286 


+ 




+ 






+ 




phosphate ABC transporter, phosphate-binding protein 


SAG1007 


342 


+ 




+ 




+ 


f 




iron-compound ABC transporter, iron-compound-binding 
protein 


SAG1014


190 


+ 








- 


- 




conserved hypothetical protein 


SAG1018 


40 






+ 










lipoprotein, putative 


SAG1024


183 


+ 














lipoprotein, putative 


SAG1029


101 


+ 














hypothetical protein 


SAG1030


304 


+ 








+ 


+ 




protein of unknown function 


SAG1037


157 


+ 














hypothetical protein 


SAG1052 


47 














+ 


cell wall surface anchor family protein, putative 


SAG1072


200 


+ 












- 


conserved hypothetical protein 


SAG1094


278 








+ 




+ 




conserved hypothetical protein 


SAG1108 


357 


+ 










- 




spermidine/putrescine ABC transporter, 
spermidine/putrescine-binding prot. 


SAG1121


295 


+ 














polysaccharide deacetylase family protein 


SAG1126


228 


+ 










+ 




protein of unknown function 

* 


SAG1127


446 














+ 


conserved domain protein 


SAG1130


49 


+ 












+ 


hypothetical protein 


SAG1138


64 


+ 














conserved hypothetical protein 


SAG1139


193 
















conserved hypothetical protein 
WO 2004/018646



PCT/US2003/026827 



Table 2 



ORF 


Size 
(aa) 
SAG1149


207 


+ 




+ 










lipoprotein, putative 


SAG1184


236 


+ 














conserved hypothetical protein 


SAG1186 


553 








+ 








metallo-beta-lactamase superfamily protein 


SAG1189


334 


+ 














conserved hypothetical protein 


SAG1190


551 








+ 








adherence and virulence protein A 


SAG1197 


1072 


+ 














hyaluronidase 


SAG1201 


367 


+ 














iminodiacetate oxidase, putative 


SAG1206 


854 


+ 














conserved domain protein 


SAG1214 


58 


+ 














hypothetical protein 


SAG1216 


1252 




+ 






+ 


- 




pullulanase, putative 


SAG1227


198 


+ 








+ 






protein of unknown function 


SAG1233


822 


+ 








+ 


- 




streptococcal histidine triad family protein 


SAG1234


306 


+ 




+ 




+ 






laminin-binding surface protein 


SAG1238 


202 


+ 














hypothetical protein 


SAG1283 


1631 




+ 






+ 


+ 




agglutinin receptor 


SAG1313 


56 


+ 














conserved hypothetical protein 


SAG1327 


409 


+ 














sensor histidine kinase 


SAG1331


979 


+ 


+ 












R5 protein 


SAG1333


690 


+ 


+ 






+ 


+ 




5-nucleotidase family protein 


SAG1350


544 
















surface antigen-related protein 


SAG1361 


414 


+ 






■ 








conserved hypothetical protein 


SAG1371 


392 


+ 














conserved hypothetical protein 


SAG1393 


310 






+ 










iron compound ABC transporter, substrate-binding protein 


SAG1404


308 


+ 


+ 












cell wall surface anchor family protein 


SAG1405


294 
















sortase family protein 


SAG1406


293 
















sortase family protein 


SAG1407


705 


+ 


+ 






+ 


+




cell wall surface anchor family protein 


SAG1408


901 




+ 












cell wall surface anchor family protein 


SAG1419 


577 














+ 


lipoprotein, putative 


SAG1431


268 
















amino acid ABC transporter, amino acid-binding protein 


SAG1433 


375 


+ 














conserved hypothetical protein 


SAG1441 


415 


+ 










+ 




maltose/maltodextrin ABC transporter, 
maltose/maltodextrin-binding protein 



SAG1462


970 




+ 












cell wall surface anchor family protein 


SAG1473


192 


+ 


+ 










+ 


cell wall surface anchor family protein 


SAG1474


680 
















amidase family protein 


SAG1483 


78 


+ 














preprotein translocase, SecG subunit 


SAG1488


195 










+ 






dephospho-CoA kinase 


SAG1491


530 














+ 


hypothetical protein 


SAG1508


590 








+ 


+


- 




67 kDa Myosin-crossreactive streptococcal antigen 


SAG1518 


538 






+ 










peptide ABC transporter, peptide-binding protein 


SAG1530


267 










+ 


- 




peptidyl-prolyl cis-trans isomerase, cyclophilin-type 


SAG1533


308 






+ 




+ 


- 




manganese ABC transporter, manganese-binding adhesion
lipoprotein


SAG1544


232 


+ 














gluconate 5-dehydrogenase, putative 


SAG1551


67 


+ 












+ 


hypothetical protein 


SAG1552


719 
















conserved hypothetical protein 


SAG1553


477 


+ 












+ 


hypothetical protein 


SAG1562


280 


+ 














conserved hypothetical protein 


SAG1582


388 


+ 




+ 






- 




branched-chain amino acid ABC transporter, amino acid- 
binding protein 


SAG1590 


449 








+ 


+ 


+ 




potassium uptake protein, Trk family 


SAG1601 


79 
















conserved hypothetical protein 


SAG1610 


285 






+ 




+ 


- 




amino acid ABC transporter, substrate-binding protein 


SAG1618 


1032 
















Snf2 family protein 


SAG1624


501 


+ 














sensor histidine kinase CsrS 


SAG1628


184 


+ 














lemA protein 


SAG1631


223 


+ 










- 




potassium uptake protein, Trk family, putative 


SAG1641


274 


+










- 




YaeC family protein 


SAG1642


277 






+ 




+ 






ABC transporter, substrate-binding protein 


SAG1683


512 
















immunogenic secreted protein, putative 


SAG1706


238 


+ 














conserved hypothetical protein 


SAG1745


148 


+ 












+ 


hypothetical protein 


SAG1752


390 
















conserved hypothetical protein TIGR00275 


SAG1759


230 












+ 




protein of unknown function 


SAG1762 


169 


+ 















conserved hypothetical protein 
SAG1767


289 






+ 








i 


acid phosphatase


SAG1768


336 








+ 


+






glyceraldehyde 3-phosphate dehydrogenase


SAG1774


424 


+ 












< 


conserved hypothetical protein 


SAG1786


130 


+ 








+ 






protein of unknown function 


SAG1787


420 


+ 














dltD protein 


SAG1791


395 


+ 














sensor histidine kinase 


SAG1822


272 


+ 








+ 


- 




protein of unknown function 


SAG1823


418 








+ 


+


+ 




protein of unknown function 


SAG1837


468 








+ 








prophage LambdaSa2, lysin, putative 


SAG1838 


109 


+ 














prophage LambdaSa2, holin, putative 


SAG1839


136 


+ 














conserved hypothetical protein 


SAG1842


1224 








+ 








prophage LambdaSa2, PblB, putative 


SAG1912 


194 


+ 














N-acetylmuramoyl-L-alanine amidase, family 4 protein 


SAG1921 


508 


+ 














sensor histidine kinase 


SAG1932 


816 


+ 














neuraminidase-related protein 


SAG1938 


307 


+ 




+ 






ft* 




adhesion lipoprotein 


SAG1941 


800 


+ 


+ 






+ 


m 




2',3'-cyclic-nucleotide 2'-phosphodiesterase


SAG1945


345 


+ 














iron ABC transporter, iron-binding protein 


SAG1947


549 








+ 








conserved hypothetical protein 


SAG1960 


551 








+ 


+ 






sensor histidine kinase 


SAG1966 


293 






+ 






- 




hemolysin precursor, putative 


SAG1996 


263 




+ 












cell wall surface anchor family protein, putative 


SAG1997


182 


+ 














hypothetical protein 


SAG1998


457 


+ 














hypothetical protein 


SAG2021 


826 
















cell wall surface anchor family protein 


SAG2043 


255 


+ 










[ 




cAMP factor 


SAG2053 


1570




+ 












serine protease, subtilase family, putative 


SAG2055 


462 


; 






+ 








sensor histidine kinase 


SAG2056 


202


+












+ 


chromosome assembly-related protein 


SAG2063 


630


+


+ 












pathogenicity protein, putative 


SAG2078 


320


+




+ 




+ 






protein of unknown function/lipoprotein, putative


SAG2094 




+ 










+ 




competence/damage-inducible protein CinA, authentic 
frameshift 



SAG2121


223 


+ 












+ 


hypothetical protein 


SAG2123 


454 
















sensor histidine kinase


SAG2141


660 


+ 








+ 


* 




DHH family protein 


SAG2147 


234 






+ 






+ 




protein of unknown function/lipoprotein, putative 


SAG2148 


179 


+ 














LysM domain protein 


SAG2174 


409 


+ 














serine protease 


SAG0013 


428 


+ 














protein of unknown function 



Table 3

ORF Annotation

SAG0038 conserved hypothetical protein

SAG0048 transcriptional regulator, Cro/CI family

SAG0091 transcriptional regulator ComX1, putative

SAG0137 conserved hypothetical protein

SAG0686 DNA-entry nuclease, putative

SAG0770 membrane protein, putative

SAG0868 DNA-entry nuclease

SAG1143 conserved hypothetical protein

SAG1233 streptococcal histidine triad family protein

SAG1596 integrase/recombinase, phage integrase family

SAG1616 conserved hypothetical protein

SAG1721 conserved hypothetical protein
Table 5

Strain        Source        Capsular serotype    Reference

090           Lancefield    Ia
515           Houston       Ia                   (1)
A909          Lancefield    Ia                   (2)
Davis         Channing      Ia
DK1           Houston       Ia
DK8           Houston       Ia
H36b          Lancefield    Ib                   (2)
(S7) 7357b    Channing      Ib                   (3)
18RS21        Lancefield    II                   (4)
DK21          Houston       II
COH1          Seattle       III                  (5)
COH31         Seattle       III                  (6)
D136C         Lancefield    III                  (4)
M781          Houston       III                  (7)
M732          Houston       III                  (8)
1169NT1       Atlanta       V                    (9)
2603V/R       Italy         V                    This study
CJB111        Houston       V                    (10)
JM9130013     Japan         VIII                 (11)
SMU014        Japan         VIII                 (11)
CJB110        Houston       Nontypeable          (12)
Table 6

Cluster 1 

SAG0230 conserved hypothetical protein 

SAG0231 hypothetical protein

SAG0232 hypothetical protein 

SAG0233 hypothetical protein

SAG0234 hypothetical protein 

SAG0235 hypothetical protein 



Cluster 2 

SAG0222 conserved domain protein 

SAG0223 conserved hypothetical protein, fusion 

SAG0225 hypothetical protein 

SAG0226 recombination protein

SAG0227 hypothetical protein 

SAG0228 conserved hypothetical protein 

SAG0229 conserved hypothetical protein 

Cluster 3 

SAG0634 hypothetical protein 

SAG0635 acid phosphatase, class B 

SAG0636 conserved hypothetical protein 

SAG0638 cell wall surface anchor family protein, interruption-N 

SAG0640 transposase OrfA, IS3 family 
SAG0642 hypothetical protein

SAG0643 chaperonin, 33 kDa, degenerate

SAG0644 transcriptional regulator, AraC family

SAG0645 cell wall surface anchor family protein

SAG0646 cell wall surface anchor family protein

SAG0647 sortase family protein

SAG0648 sortase family protein

SAG0649 cell wall surface anchor family protein, putative

SAG0650 sortase family protein

SAG0651 protein of unknown function

Cluster 4

SAG1898 PTS system, IID component

SAG1899 PTS system, IIC component

SAG1900 PTS system, IIB component

SAG1901 glucuronyl hydrolase

SAG1902 PTS system, IIA component

SAG1905 conserved hypothetical protein

SAG1906 carbohydrate kinase, PfkB family

Cluster 5

SAG0247 hypothetical protein

SAG0248 hypothetical protein

SAG0249 hypothetical protein

SAG0674 hypothetical protein

SAG0675 putative secreted protein

SAG0676 proteinase, putative

SAG0677 hypothetical protein

SAG0680 protein of unknown function

SAG0681 conserved domain protein

SAG0684 ABC transporter, ATP-binding protein

SAG1698 conserved hypothetical protein

Cluster 6

SAG0261 IS1381, transposase OrfB

SAG0262 IS1381, transposase OrfA

SAG0965 IS1381, transposase OrfA

SAG0966 IS1381, transposase OrfB

SAG2002 IS1381, transposase OrfB

Cluster 7

SAG1027 conserved hypothetical protein

SAG1028 hypothetical protein

SAG1029 hypothetical protein

SAG1030 protein of unknown function

SAG1031 conserved domain protein

SAG1032 conserved hypothetical protein

Cluster 8

SAG1253 transposase, ISL3 family

SAG1254 mercuric reductase

SAG1255 mercuric resistance operon regulatory protein MerR

SAG2022 transposase, ISL3 family

SAG2023 mercuric reductase

SAG2024 mercuric resistance operon regulatory protein MerR



Cluster 9 



SAG1993 site-specific recombinase, phage integrase family

SAG1994 conserved hypothetical protein

SAG1995 hypothetical protein

SAG1996 cell wall surface anchor family protein, putative

SAG1997 hypothetical protein

SAG1998 hypothetical protein

SAG2000 membrane protein, putative

SAG2001 conjugal transfer protein, interruption-C

SAG2007 conserved hypothetical protein

SAG2008 conserved hypothetical protein

SAG2009 conserved hypothetical protein

SAG2010 hypothetical protein
SAG2011 conserved hypothetical protein

SAG2012 hypothetical protein

SAG2016 hypothetical protein

SAG2017 transcriptional regulator, Cro/CI family

SAG2025 Mn2+/Fe2+ transporter, NRAMP family

Cluster 10

SAG1039 conserved hypothetical protein

SAG1447 conserved hypothetical protein

SAG1448 glycosyl transferase, group 1 family protein

SAG1449 preprotein translocase SecA subunit, putative

SAG1450 conserved domain protein

SAG1452 conserved hypothetical protein

SAG1453 preprotein translocase SecY family protein

SAG1454 glycosyl transferase, putative

SAG1455 glycosyl transferase, group 2 family protein

SAG1456 glycosyl transferase, family 8, degenerate

SAG1459 glycosyl transferase family 8

SAG1460 glycosyl transferase, family 8

SAG1461 conserved hypothetical protein

SAG1462 cell wall surface anchor family protein

SAG1463 transcriptional regulator, RofA family, authentic point mutation

SAG1469 conserved hypothetical protein
SAG1471 conserved hypothetical protein

SAG1933 PTS system, IIC component, putative



Cluster 11 

SAG0009 hypothetical protein 

SAG0120 hypothetical protein

SAG0157 deoxyribonuclease-related protein, degenerate

SAG0186 hypothetical protein

SAG0216 hypothetical protein

SAG0236 hypothetical protein

SAG0307 hypothetical protein

SAG0308 ABC transporter, ATP-binding protein

SAG0311 DNA-binding response regulator, authentic point mutation

SAG0518 peptide chain release factor 2, programmed frameshift

SAG0553 hypothetical protein

SAG0555 prophage LambdaSa1, antirepressor, putative

SAG0564 conserved hypothetical protein

SAG0579 conserved hypothetical protein

SAG0580 conserved hypothetical protein, truncation

SAG0611 transposase, degenerate

SAG0637 transcriptional regulator, TetR family, putative, authentic frameshift 

SAG0641 Tn5252, Orf 10 protein, degenerate 

SAG0652 Tn5252, Orf 28 protein, degenerate 






SAG0655 conserved hypothetical protein

SAG0678 endopeptidase O, degenerate

SAG0683 transmembrane protein Vexp3, putative, degenerate

SAG0855 glycogen biosynthesis protein GlgD, authentic frameshift

SAG0898 hypothetical protein

SAG0899 hypothetical protein

SAG0901 hypothetical protein

SAG0902 hypothetical protein

SAG0903 hypothetical protein

SAG0917 Tn916, hypothetical protein

SAG0920 Tn916, hypothetical protein

SAG0922 Tn916, hypothetical protein

SAG0924 Tn916, tetM leader peptide

SAG0928 Tn916, hypothetical protein, authentic frameshift

SAG0936 Tn916, hypothetical protein

SAG0943 hypothetical protein

SAG0972 conserved hypothetical protein, authentic frameshift

SAG1023 hypothetical protein

SAG1080 hypothetical protein

SAG1123 hypothetical protein

SAG1129 hypothetical protein

SAG1136 conserved hypothetical protein

SAG1217 conserved hypothetical protein, authentic frameshift
SAG1231 transposase OrfB, IS3 family, degenerate

SAG1242 transposase OrfB, IS3 family, truncation

SAG1309 hypothetical protein

SAG1331 R5 protein

SAG1437 hypothetical protein

SAG1445 MutT/nudix family protein, authentic frameshift

SAG1484 ribosomal protein L33

SAG1493 hypothetical protein

SAG1539 hypothetical protein

SAG1543 conserved hypothetical protein, authentic frameshift

SAG1560 hypothetical protein

SAG1568 phosphoserine aminotransferase, authentic frameshift

SAG1570 conserved hypothetical protein

SAG1601 conserved hypothetical protein

SAG1644 hypothetical protein

SAG1646 hypothetical protein

SAG1699 hypothetical protein

SAG1705 peptidase, M24 family, authentic point mutation

SAG1708 hypothetical protein

SAG1857 prophage LambdaSa2, HNH endonuclease family protein

SAG1864 hypothetical protein

SAG1868 hypothetical protein
SAG1869 prophage LambdaSa2, type II DNA modification methyltransferase, putative

SAG1872 hypothetical protein

SAG1874 hypothetical protein

SAG1876 prophage LambdaSa2, HNH endonuclease family protein

SAG1878 conserved domain protein

SAG1881 hypothetical protein

SAG1883 conserved hypothetical protein

SAG1886 hypothetical protein

SAG1903 hypothetical protein

SAG1937 streptococcal histidine triad family protein, degenerate

SAG1971 hypothetical protein

SAG1979 membrane protein, putative

SAG1980 ABC transporter, ATP-binding protein

SAG1981 hypothetical protein

SAG1982 transcriptional regulator, Cro/CI family

SAG1983 conserved hypothetical protein

SAG1984 conserved hypothetical protein TIGR00730

SAG1985 hypothetical protein

SAG1991 transcriptional regulator, Cro/CI family

SAG1992 protein of unknown function

SAG1999 hypothetical protein

SAG2004 conjugal transfer protein, interruption-N
SAG2039 conserved hypothetical protein

SAG2044 hypothetical protein

SAG2052 hypothetical protein

SAG2065 ribosomal protein L33

SAG2094 competence/damage-inducible protein CinA, authentic frameshift

SAG2099 hypothetical protein

Cluster 12

SAG1164 glycosyl transferase CpsJ(V)

SAG1165 glycosyl transferase CpsO(V)

SAG1166 glycosyl transferase CpsN(V)

SAG1167 polysaccharide biosynthesis protein CpsM(V)

SAG1168 polysaccharide biosynthesis protein cpsH(V)

Cluster 13

SAG0581 conserved hypothetical protein

SAG0582 conserved hypothetical protein

SAG0583 conserved hypothetical protein

SAG0585 conserved hypothetical protein

SAG0586 conserved hypothetical protein

SAG0587 prophage LambdaSa1, structural protein, putative

SAG0588 conserved hypothetical protein

SAG0589 conserved hypothetical protein

SAG0590 conserved hypothetical protein

SAG0591 conserved hypothetical protein

SAG0593 prophage LambdaSa1, structural protein

SAG0594 conserved hypothetical protein

SAG0595 conserved hypothetical protein

SAG0596 prophage LambdaSa1, pblA protein, internal deletion

Cluster 14

SAG0915 Tn916, transposase

SAG0918 Tn916, hypothetical protein

SAG0919 Tn916, hypothetical protein

SAG0921 Tn916, transcriptional regulator, putative

SAG0925 Tn916, hypothetical protein

SAG0926 Tn916, NLP/P60 family protein

SAG0927 membrane protein, putative

SAG0929 Tn916, hypothetical protein

SAG0930 Tn916, hypothetical protein

SAG0931 Tn916, hypothetical protein

SAG0932 Tn916, transcriptional regulator, putative

SAG0933 Tn916, FtsK/SpoIIIE family protein

SAG0934 Tn916, hypothetical protein

SAG0935 Tn916, hypothetical protein

SAG0937 ABC transporter, ATP-binding protein, authentic frameshift
Cluster 15

SAG1835 conserved hypothetical protein

SAG1837 prophage LambdaSa2, lysin, putative

SAG1839 conserved hypothetical protein

SAG1840 hypothetical protein

SAG1842 prophage LambdaSa2, PblB, putative

SAG1843 conserved hypothetical protein

SAG1844 conserved hypothetical protein

SAG1849 hypothetical protein

SAG1851 conserved domain protein

SAG1852 conserved domain protein

SAG1853 prophage LambdaSa2, protease, putative

SAG1854 conserved hypothetical protein

SAG1855 prophage LambdaSa2, terminase large subunit, putative

SAG1856 hypothetical protein

SAG1858 hypothetical protein

SAG1859 prophage LambdaSa2, site-specific recombinase, phage integrase family

SAG1860 conserved hypothetical protein

SAG1861 prophage LambdaSa2, transcriptional regulator, Cro/CI family

SAG1862 hypothetical protein

SAG1863 prophage LambdaSa2, single-strand binding protein

SAG1865 conserved hypothetical protein
SAG1866 conserved hypothetical protein

SAG1867 conserved hypothetical protein

SAG1870 prophage LambdaSa2, DNA replication protein DnaC, putative

SAG1871 prophage LambdaSa2, bacteriophage replication protein/hypothetical protein, truncation/fusion

SAG1873 prophage LambdaSa2, replicative DNA helicase

SAG1877 prophage LambdaSa2, antirepressor protein, putative

SAG1879 hypothetical protein

SAG1882 prophage LambdaSa2, repressor protein, putative

SAG1884 hypothetical protein

SAG1885 prophage LambdaSa2, site-specific recombinase, phage integrase family



Cluster 16 



SAG1247 site-specific recombinase, phage integrase family 

SAG1250 Tn5252, relaxase 

SAG1251 Tn5252, Orf 9 protein

SAG1252 Tn5252, Orf 10 protein 

SAG1256 IS861, transposase OrfB, truncation 

SAG1257 cation-transporting ATPase, E1-E2 family 

SAG1258 cadmium efflux system accessory protein 

SAG1259 conserved hypothetical protein 

SAG1260 hypothetical protein

SAG1261 conserved hypothetical protein






SAG1262 cation-transporting ATPase, E1-E2 family

SAG1263 conserved domain protein, authentic frameshift

SAG1264 transcriptional repressor CopY, putative

SAG1265 cadmium resistance transporter, putative

SAG1266 hypothetical protein

SAG1267 hypothetical protein

SAG1268 repressor protein, putative

SAG1270 ImpB/MucB/SamB family protein

SAG1271 conserved hypothetical protein

SAG1272 conserved hypothetical protein

SAG1273 conserved hypothetical protein


L?/1VJ X X- / *T 


conserved hypothetical protein


SAG1276 conserved hypothetical protein

SAG1277 hypothetical protein

SAG1278 hypothetical protein

SAG1279 conserved domain protein

SAG1280 SNF2 family protein

SAG1281 hypothetical protein


SAG1283 


agglutinin receptor 


SAG1284 


abortive infection protein AbiGI 


SAG1285 


abortive infection protein AbiGII 


SAG1286 


Tn5252, Orf28 


SAG1287 


Tn5252, Orf26 



SAG1288 


Tn5252, Orf25, degenerate 


SAG1289 


Tn5252, Orf23 


SAG1290 


hypothetical protein


SAG1291 


Tn5252 Orf 21 protein, internal deletion 




hypothetical protein




protease, putative


SAG1294 


conserved hypothetical protein


SAG1295 


conserved hypothetical protein 


SAG1296 


conserved hypothetical protein 


SAG1297 


C-5 eytosine-specific DNA methylase 


SAG1299 


conserved hypothetical protein 


SAG1304 


hypothetical protein 
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Table 7 



Locus Annotation

Housekeeping

SAG0466 thiolase

SAG0471 glucokinase

SAG0492 amino acid ABC transporter, ATP-binding protein

SAG0767 D-alanine--D-alanine ligase

SAG1086 xanthine phosphoribosyltransferase

SAG1600 glutamate racemase

SAG1680 shikimate 5-dehydrogenase

SAG1723 signal peptidase I


Surface-exposed

SAG0079 adenylate kinase

SAG0093 D-alanyl-D-alanine carboxypeptidase family protein

SAG0163 competence protein CglA

SAG0290 ABC transporter, substrate-binding protein

SAG0368 protein of unknown function

SAG0503 lipase/acylhydrolase




cell wall surface anchor family protein


SAG1552 conserved hypothetical protein

SAG1641 YaeC family protein

SAG2147 protein of unknown function/lipoprotein, putative

SAG2148 LysM domain protein

Table 8: GBS genes shared with GAS and pneumococcus

ORFxxxxx Annotation 

ORF00003 PcsB protein (pscB)

ORF00004 ribose-phosphate pyrophosphokinase (prsA)

ORF00005 aminotransferase, class I

ORF00006 recombination protein O

ORF00009 fatty acid/phospholipid synthesis protein PlsX (plsX)

ORF00011 phosphoribosylaminoimidazole-succinocarboxamide synthase (purC)

ORF00012 phosphoribosylformylglycinamidine synthase, putative

ORF00013 amidophosphoribosyltransferase (purF)

ORF00014 phosphoribosylformylglycinamidine cyclo-ligase (purM)

ORF00015 phosphoribosylglycinamide formyltransferase (purN)

ORF00020 group B streptococcal surface immunogenic protein

ORF00021 N-acetylmannosamine-6-P epimerase, putative

ORF00022 sugar ABC transporter, sugar-binding protein

ORF00023 sugar ABC transporter, permease protein

ORF00024 sugar ABC transporter, permease protein

ORF00026 conserved hypothetical protein

ORF00027 N-acetylneuraminate lyase, putative

ORF00028 expressed ROK family protein

ORF00030 phosphosugar-binding transcriptional regulator, RpiR family, putative

ORF00031 phosphoribosylamine-glycine ligase (purD)

ORF00032 phosphoribosylaminoimidazole carboxylase, catalytic subunit (purE)

ORF00033 phosphoribosylaminoimidazole carboxylase, ATPase subunit (purK)

ORF00036 adenylosuccinate lyase (purB)

ORF00037 transcriptional regulator, Cro/CI family

ORF00038 Holliday junction DNA helicase RuvB (ruvB) 

ORF00039 phosphotyrosine protein phosphatase, low molecular weight 

ORF00040 MORN motif family protein 

ORF00041 membrane protein, putative 

ORF00043 alcohol dehydrogenase, propanol-preferring (adhP)

ORF00045 MATE efflux family protein

ORF00046 ribosomal protein S10 (rpsJ)

ORF00047 ribosomal protein L3 (rplC)

ORF00048 ribosomal protein L4 (rplD)

ORF00049 ribosomal protein L23 (rplW)

ORF00050 ribosomal protein L2 (rplB)

ORF00052 ribosomal protein S19 (rpsS)

ORF00054 ribosomal protein L22 (rplV)

ORF00055 ribosomal protein S3 (rpsC)

ORF00056 ribosomal protein L16 (rplP)

ORF00058 ribosomal protein L29 (rpmC)

ORF00059 ribosomal protein S17 (rpsQ)

ORF00060 ribosomal protein L14 (rplN)

ORF00061 ribosomal protein L24 (rplX)

ORF00063 ribosomal protein L5 (rplE)

ORF00065 ribosomal protein S8 (rpsH)

ORF00066 ribosomal protein L6 (rplF)

ORF00068 ribosomal protein L18 (rplR)

ORF00069 ribosomal protein S5 (rpsE)

ORF00070 ribosomal protein L30 (rpmD)

ORF00071 ribosomal protein L15 (rplO)

ORF00072 preprotein translocase, SecY subunit

ORF00073 adenylate kinase (adk)

ORF00074 translation initiation factor IF-1 (infA)

ORF00075 ribosomal protein L36 (rpmJ)

ORF00077 ribosomal protein S13 (rpsM)

ORF00078 ribosomal protein S11 (rpsK)

ORF00080 DNA-directed RNA polymerase, alpha subunit (rpoA)

ORF00093 transcriptional regulator ComX1, putative

ORF00094 phosphoglycerate mutase family protein 

ORF00097 heat-inducible transcription repressor HrcA (hrcA) 

ORF00098 heat shock protein GrpE (grpE) 

ORF00099 dnaK protein (dnaK) 

ORF00100 dnaJ protein (dnaJ)

ORF00101 transcriptional regulator, GntR family

ORF00102 tRNA pseudouridine synthase A (truA)

ORF00103 phosphomethylpyrimidine kinase, putative

ORF00104 conserved hypothetical protein

ORF00105 conserved hypothetical protein

ORF00106 conserved hypothetical protein

ORF00107 trigger factor (tig)

ORF00108 DNA-directed RNA polymerase, delta subunit, putative

ORF00109 CTP synthase (pyrG)

ORF00111 deoxyuridine 5'-triphosphate nucleotidohydrolase (dut)

ORF00113 carbonic anhydrase-related protein

ORF00115 pyridine nucleotide-disulphide oxidoreductase family protein

ORF00116 glutamyl-tRNA synthetase (gltX)

ORF00119 ribose ABC transporter, ATP-binding protein (rbsA)

ORF00122 ribose operon repressor RbsR (rbsR)

ORF00125 ABC transporter, ATP-binding protein 

ORF00126 DNA-binding response regulator 

ORF00128 sensor histidine kinase

ORF00131 fructose-bisphosphate aldolase (fba)

ORF00132 L-2-hydroxyisocaproate dehydrogenase

ORF00133 ribosomal protein L28 (rpmB)

ORF00134 conserved hypothetical protein

ORF00135 DAK2 domain protein

ORF00136 expressed SPFH domain/Band 7 family protein

ORF00141 amino acid ABC transporter, ATP-binding protein

ORF00142 amino acid ABC transporter, amino acid-binding protein/permease protein

ORF00143 conserved hypothetical protein

ORF00145 undecaprenol kinase, putative

ORF00146 negative regulator of competence MecA, putative

ORF00149 ABC transporter, ATP-binding protein

ORF00150 conserved hypothetical protein

ORF00151 selenocysteine lyase (csdB)

ORF00152 NifU family protein

ORF00153 conserved hypothetical protein

ORF00155 D-alanyl-D-alanine carboxypeptidase

ORF00158 oligopeptide ABC transporter, permease protein 

ORF00160 oligopeptide ABC transporter, ATP-binding protein 

ORF00161 oligopeptide ABC transporter, ATP-binding protein 

ORF00167 adc operon repressor AdcR (adcR) 

ORF00168 zinc ABC transporter, ATP-binding protein 

ORF00169 zinc ABC transporter, permease protein 

ORF00172 tyrosyl-tRNA synthetase (tyrS)

ORF00173 penicillin-binding protein 1B, putative

ORF00174 DNA-directed RNA polymerase, beta subunit (rpoB)

ORF00176 DNA-directed RNA polymerase beta' subunit (rpoC)

ORF00178 conserved hypothetical protein

ORF00179 competence protein CglA (cglA)

ORF00180 competence protein CglB (cglB)

ORF00181 conserved hypothetical protein

ORF00183 conserved hypothetical protein

ORF00184 acetate kinase (ackA)

ORF00190 pyrroline-5-carboxylate reductase (proC)

ORF00191 glutamyl-aminopeptidase (pepA)

ORF00198 single-strand binding protein (ssb)

ORF00211 PTS system, IIABC components

ORF00212 alpha amylase family protein

ORF00214 transcriptional antiterminator, BglG family

ORF00219 PTS system, IIC component, putative

ORF00224 ribosomal protein S15 (rpsO)

ORF00225 polyribonucleotide nucleotidyltransferase (pnp)

ORF00227 serine O-acetyltransferase (cysE)

ORF00229 cysteinyl-tRNA synthetase (cysS)

ORF00230 conserved hypothetical protein

ORF00231 RNA methyltransferase, TrmH family, group 3

ORF00233 DegV family protein

ORF00236 ribosomal protein L13 (rplM)

ORF00237 ribosomal protein S9 (rpsI)

ORF00261 transcriptional regulator, MutR family

ORF00262 transporter, putative 

ORF00263 amino acid ABC transporter, permease protein 

ORF00264 amino acid ABC transporter, amino acid-binding protein 

ORF00265 amino acid ABC transporter, permease protein 

ORF00266 amino acid ABC transporter, ATP-binding protein

ORF00295 N-acetylglucosamine-6-phosphate deacetylase (nagA)

ORF00296 conserved hypothetical protein

ORF00297 glycyl-tRNA synthetase, alpha subunit (glyQ)

ORF00299 glycyl-tRNA synthetase, beta subunit (glyS) 

ORF00300 conserved hypothetical protein 

ORF00302 glycerol kinase (glpK) 

ORF00303 alpha-glycerophosphate oxidase

ORF00304 glycerol uptake facilitator protein (glpF) 

ORF00306 conserved hypothetical protein 

ORF00307 transketolase (tkt) 

ORF00309 ABC transporter, ATP-binding protein

ORF00310 membrane protein, putative

ORF00313 PTS system, IIBC components

ORF00314 glutamate 5-kinase (proB)

ORF00315 gamma-glutamyl phosphate reductase (proA)

ORF00316 conserved hypothetical protein TIGR00006

ORF00318 penicillin-binding protein 2X (pbpX)

ORF00319 phospho-N-acetylmuramoyl-pentapeptide-transferase (mraY)

ORF00320 ATP-dependent RNA helicase, DEAD/DEAH box family

ORF00321 ABC transporter, substrate-binding protein

ORF00322 amino acid ABC transporter, permease protein

ORF00323 amino acid ABC transporter, ATP-binding protein

ORF00325 thioredoxin reductase (trxB)

ORF00326 conserved hypothetical protein

ORF00327 NAD synthetase (nadE)

ORF00328 aminopeptidase C (pepC)

ORF00329 penicillin-binding protein 1A (pbp1A)

ORF00330 recombination protein U (recU)

ORF00331 conserved hypothetical protein

ORF00335 conserved hypothetical protein

ORF00336 conserved hypothetical protein

ORF00337 autoinducer-2 production protein LuxS (luxS) 

ORF00338 KH domain protein

ORF00348 guanylate kinase (gmk)

ORF00349 DNA-directed RNA polymerase, omega subunit, putative

ORF00350 primosomal protein N' (priA)

ORF00351 methionyl-tRNA formyltransferase (fmt)

ORF00352 Sun protein (sun)

ORF00353 serine/threonine phosphatase, putative

ORF00354 serine/threonine protein kinase

ORF00355 conserved hypothetical protein

ORF00356 sensor histidine kinase, putative

ORF00358 DNA-binding response regulator

ORF00359 hydrolase, haloacid dehalogenase family/peptidyl-prolyl cis-trans isomerase, cyclophilin type

ORF00360 general stress protein, putative

ORF00361 pyruvate formate-lyase-activating enzyme (pflA)

ORF00362 transcriptional regulator, DeoR family

ORF00363 transcriptional regulator, putative

ORF00364 PTS system, cellobiose-specific IIA component (celC)

ORF00366 PTS system, cellobiose-specific IIB component (celA)

ORF00367 PTS system, cellobiose-specific IIC component (celB)

ORF00368 formate acetyltransferase (pflD)

ORF00369 transaldolase family protein 

ORF00371 glycerol dehydrogenase (gldA)

ORF00372 cysteine synthase A (cysK) 

ORF00373 conserved hypothetical protein TIGR00257 

ORF00374 helicase, putative 

ORF00375 competence protein F, putative

ORF00376 ribosomal subunit interface protein (yfiA)

ORF00385 enoyl-CoA hydratase/isomerase family protein

ORF00386 transcriptional regulator, MarR family

ORF00387 3-oxoacyl-(acyl-carrier-protein) synthase III (fabH)

ORF00388 acyl carrier protein (acpP)

ORF00390 enoyl-(acyl-carrier-protein) reductase II (fabK)

ORF00391 malonyl CoA-acyl carrier protein transacylase (fabD)

ORF00392 3-oxoacyl-[acyl-carrier protein] reductase (fabG)

ORF00393 3-oxoacyl-(acyl-carrier-protein) synthase II (fabF)

ORF00394 acetyl-CoA carboxylase, biotin carboxyl carrier protein (accB)

ORF00395 (3R)-hydroxymyristoyl-(acyl-carrier-protein) dehydratase (fabZ)

ORF00396 acetyl-CoA carboxylase, biotin carboxylase (accC)

ORF00397 acetyl-CoA carboxylase, carboxyl transferase, beta subunit (accD)

ORF00398 acetyl-CoA carboxylase, carboxyl transferase, alpha subunit (accA)

ORF00400 seryl-tRNA synthetase (serS)

ORF00403 conserved hypothetical protein

ORF00404 PTS system, mannose-specific IID component

ORF00405 PTS system, mannose-specific IIC component (manM)

ORF00406 PTS system, mannose-specific IIAB components (manL)

ORF00407 hydrolase, haloacid dehalogenase-like family

ORF00410 xanthine/uracil permease family protein

ORF00411 conserved hypothetical protein TIGR00150, putative

ORF00412 acetyltransferase, GNAT family

ORF00413 expressed protein of unknown function

ORF00415 HIT family protein (hit) 

ORF00419 ABC transporter, ATP-binding protein

ORF00421 ABC transporter, permease protein

ORF00422 conserved hypothetical protein

ORF00423 conserved hypothetical protein TIGR00091

ORF00424 conserved hypothetical protein, POINT MUTATION

ORF00425 N utilization substance protein A (nusA)

ORF00426 conserved hypothetical protein

ORF00427 ribosomal protein L7A family

ORF00428 translation initiation factor IF-2

ORF00429 ribosome-binding factor A (rbfA)

ORF00432 copper-transporter ATPase CopA

ORF00435 hydrolase, haloacid dehalogenase-like family

ORF00436 DNA polymerase I (polA)

ORF00437 CoA binding domain protein

ORF00440 DNA-binding response regulator

ORF00441 sensor histidine kinase

ORF00443 queuine tRNA-ribosyltransferase (tgt)

ORF00444 conserved hypothetical protein

ORF00449 glucose-6-phosphate isomerase (pgi)

ORF00451 rhomboid family protein

ORF00452 expressed putative lipoprotein

ORF00453 UTP-glucose-1-phosphate uridylyltransferase (galU)

ORF00454 glycerol-3-phosphate dehydrogenase (NAD(P)+) (gpsA)

ORF00455 ribonuclease P protein component (rnpA)

ORF00456 SpoIIIJ family protein

ORF00458 R3H domain protein

ORF00463 conserved hypothetical protein

ORF00464 RecX protein

ORF00465 RNA methyltransferase, TrmA family

ORF00470 ribonucleoside-diphosphate reductase 2, beta subunit (nrdF) 

ORF00472 ribonucleoside-diphosphate reductase 2, alpha subunit (nrdE) 
ORF00482 alcohol dehydrogenase, zinc-containing 
ORF00483 oxidoreductase, aldo/keto reductase family 
ORF00484 cation efflux system protein 
ORF00485 transcriptional regulator, TetR family 

ORF00496 conserved hypothetical protein 

ORF00500 acetyltransferase, GNAT family 

ORF00501 conserved hypothetical protein 

ORF00502 valyl-tRNA synthetase (valS) 

ORF00508 aspartate-ammonia ligase (asnA)

ORF00511 type II DNA modification methyltransferase, putative

ORF00513 phosphopantetheine adenylyltransferase (coaD)

ORF00515 conserved hypothetical protein

ORF00519 conserved hypothetical protein

ORF00520 conserved hypothetical protein TIGR00048

ORF00522 ABC transporter, ATP-binding/permease protein
ORF00523 ABC transporter, ATP-binding/permease protein 

ORF00524 anthranilate synthase component II (trpG) 

ORF00532 endonuclease III (nth) 

ORF00534 conserved hypothetical protein 

ORF00535 glucokinase (glk) 

ORF00536 expressed protein with rhodanese domain 

ORF00537 elongation factor Tu family protein 

ORF00540 UDP-N-acetylmuramoylalanine--D-glutamate ligase (murD)

ORF00541 UDP-N-acetylglucosamine-N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase (murG)




ORF00542 cell division protein DivIB, putative

ORF00544 cell division protein FtsA (ftsA)

ORF00545 cell division protein FtsZ (ftsZ)

ORF00546 ylmE protein, putative

ORF00547 ylmF protein (ylmF)

ORF00549 ylmH protein (ylmH)

ORF00550 cell division protein DivIVA, putative

ORF00552 isoleucyl-tRNA synthetase (ileS)

ORF00553 conserved hypothetical protein

ORF00554 MutT/nudix family protein

ORF00555 ATP-dependent Clp protease, ATP-binding subunit

ORF00557 conserved hypothetical protein

ORF00558 amino acid ABC transporter, permease protein

ORF00559 amino acid ABC transporter, ATP-binding protein

ORF00560 phosphoglucomutase/phosphomannomutase family protein 

ORF00562 methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase (folD) 

ORF00564 exodeoxyribonuclease VII, large subunit (xseA) 

ORF00566 geranyltranstransferase, putative

ORF00567 hemolysin A

ORF00570 DNA repair protein RecN (recN)

ORF00571 expressed DegV family protein

ORF00574 DNA-binding protein HU (hup)

ORF00576 dihydroorotate dehydrogenase A (pyrDA)

ORF00577 beta-lactam resistance factor (fibB)

ORF00578 beta-lactam resistance factor (fibA)

ORF00579 murM protein, putative

ORF00580 hydrolase, haloacid dehalogenase-like family

ORF00581 HD domain protein

ORF00582 conserved hypothetical protein 

ORF00583 cation-transporting ATPase, E1-E2 family 

ORF00588 cell division ABC transporter, ATP-binding protein FtsE (ftsE) 

ORF00589 cell division ABC transporter, permease protein FtsX (ftsX) 

ORF00591 metallo-beta-lactamase superfamily protein 

ORF00593 DNA polymerase III, epsilon subunit/ATP-dependent helicase DinG

ORF00595 aspartate aminotransferase (aspC) 

ORF00596 asparaginyl-tRNA synthetase (asnS) 

ORF00601 conserved hypothetical protein 

ORF00602 conserved hypothetical protein 

ORF00603 conserved hypothetical protein

ORF00605 zinc ABC transporter, zinc-binding adhesion lipoprotein

ORF00606 ribosomal protein L31 (rpmE)

ORF00607 DHH family protein

ORF00609 flavodoxin

ORF00614 ribosomal protein L19 (rplS)

ORF00640 prophage LambdaSa1, single-strand binding protein (ssb)

ORF00693 DNA-binding response regulator VncR (vncR)

ORF00694 sensor histidine kinase VncS (vncS)

ORF00699 rod shape-determining protein RodA, putative (rodA)

ORF00700 hydrolase, haloacid dehalogenase-like family

ORF00701 DNA gyrase, B subunit (gyrB)

ORF00702 septation ring formation regulator EzrA, putative

ORF00705 conserved hypothetical protein 

ORF00706 enolase (eno) 

ORF00708 3-phosphoshikimate 1-carboxyvinyltransferase (aroA)

ORF00709 shikimate kinase (aroK)

ORF00710 psr protein

ORF00711 RNA methyltransferase, TrmA family

ORF00729 sortase family protein

ORF00731 sortase family protein

ORF00734 sortase family protein, FRAMESHIFT

ORF00743 ABC transporter, ATP-binding protein

ORF00744 membrane protein

ORF00745 conserved hypothetical protein

ORF00748 cylG protein (cylG)

ORF00776 DNA-entry nuclease, putative

ORF00789 2-keto-3-deoxygluconate kinase

ORF00792 2-dehydro-3-deoxyphosphogluconate aldolase/4-hydroxy-2-oxoglutarate aldolase (eda)

ORF00798 proline dipeptidase (pepQ)

ORF00799 transcriptional regulator, RegM family

ORF00802 glycosyl transferase, group 1 family protein

ORF00803 threonyl-tRNA synthetase (thrS)

ORF00804 DNA-binding response regulator

ORF00808 amino acid ABC transporter, permease protein

ORF00810 amino acid ABC transporter, ATP-binding protein

ORF00811 DNA-binding response regulator

ORF00812 sensory box histidine kinase

ORF00813 metallo-beta-lactamase family protein

ORF00815 ribonuclease III (rnc)

ORF00816 expressed putative chromosome segregation SMC protein

ORF00817 hydrolase, haloacid dehalogenase-like family

ORF00818 hydrolase, haloacid dehalogenase-like family

ORF00819 signal recognition particle-docking protein FtsY (ftsY)



ORF00820 ABC transporter, substrate-binding protein

ORF00821 ABC transporter, permease protein, putative

ORF00824 transcriptional accessory protein Tex, putative

ORF00825 conserved hypothetical protein

ORF00828 HPr(Ser) kinase/phosphatase (hprK)

ORF00830 prolipoprotein diacylglyceryl transferase (lgt)

ORF00832 conserved hypothetical protein

ORF00835 peptidase, U32 family, putative

ORF00836 peptidase, U32 family

ORF00837 conserved hypothetical protein

ORF00844 lysyl-tRNA synthetase (lysS)

ORF00846 phosphoglycerate mutase family protein

ORF00847 ebsC family protein, putative

ORF00850 peptidase, U32 family

ORF00855 oligoendopeptidase F, putative

ORF00856 phosphoenolpyruvate carboxylase (ppc)

ORF00859 cell division protein, FtsW/RodA/SpoVE family (ftsW)

ORF00861 translation elongation factor Tu (tuf)

ORF00863 triosephosphate isomerase (tpiA)

ORF00865 phosphoglycerate mutase (gpmA)

ORF00867 recombination protein RecR (recR)

ORF00868 D-alanine-D-alanine ligase

ORF00869 UDP-N-acetylmuramoylalanyl-D-glutamyl-2,6-diaminopimelate-D-alanyl-D-alanyl ligase (murF)

ORF00870 oxalate:formate antiporter

ORF00871 membrane protein, putative

ORF00873 peptide chain release factor 3 (prfC)

ORF00876 ABC transporter, ATP-binding protein

ORF00880 ATP-dependent RNA helicase, DEAD/DEAH box family

ORF00882 conserved hypothetical protein

ORF00883 conserved hypothetical protein

ORF00884 acyltransferase family protein 

ORF00885 competence protein CelA (celA)

ORF00887 DNA internalization-related competence protein ComEC/Rec2

ORF00889 sugar-binding transcriptional regulator, LacI family

ORF00892 DNA polymerase III, delta subunit, putative

ORF00893 superoxide dismutase, Fe-Mn (sodA)

ORF00894 transcriptional antiterminator LicT

ORF00895 PTS system, beta-glucosides-specific IIABC components

ORF00896 6-phospho-beta-glucosidase (bglA)

ORF00899 glycerate kinase 2 (garK) 

ORF00904 S-adenosylmethionine:tRNA ribosyltransferase-isomerase (queA) 

ORF00906 glucosamine-6-phosphate isomerase (nagB) 

ORF00908 ribosomal small subunit pseudouridine synthase 

ORF00911 competence protein CoiA (coiA)

ORF00912 oligoendopeptidase B (pepB)

ORF00914 O-methyltransferase family protein

ORF00916 protease maturation protein, putative

ORF00919 alanyl-tRNA synthetase (alaS) 

ORF00925 transcriptional regulator, Cro/CI family 

ORF00928 ribonucleoside-diphosphate reductase 2, beta subunit (nrdF) 

ORF00929 ribonucleoside-diphosphate reductase 2, alpha subunit (nrdE)

ORF00930 ribonucleoside-diphosphate reductase 2, NrdH-redoxin (nrdH)

ORF00931 phosphocarrier protein HPr (ptsH)

ORF00932 phosphoenolpyruvate-protein phosphotransferase (ptsI)

ORF00933 glyceraldehyde-3-phosphate dehydrogenase, NADP-dependent (gapN) 

ORF00934 polysaccharide deacetylase family protein 

ORF00935 ATP-dependent RNA helicase, DEAD/DEAH box family 

ORF00936 uridine kinase (udk)

ORF00937 conserved hypothetical protein

ORF00938 DNA polymerase III, gamma and tau subunits (dnaX)

ORF00940 biotin--acetyl-CoA-carboxylase ligase

ORF00941 S-adenosylmethionine synthetase (metK)

ORF00955 UDP-N-acetylglucosamine 1-carboxyvinyltransferase (murA)

ORF00956 acetyltransferase, GNAT family

ORF00957 CBS domain protein

ORF00958 methionine aminopeptidase, type I (map)

ORF00959 ribonuclease BN, putative

ORF00962 conserved hypothetical protein

ORF00963 DNA ligase, NAD-dependent (ligA)

ORF00964 BmrU protein, putative

ORF00966 pullulanase, putative

ORF00973 ATP synthase F0, A subunit (atpB)

ORF00974 ATP synthase F0, B subunit (atpF)

ORF00975 ATP synthase F1, delta subunit (atpH)

ORF00976 ATP synthase F1, alpha subunit (atpA)

ORF00977 ATP synthase F1, gamma subunit (atpG)

ORF00978 ATP synthase F1, beta subunit (atpD)

ORF00979 ATP synthase F1, epsilon subunit (atpC)

ORF00981 UDP-N-acetylglucosamine 1-carboxyvinyltransferase (murA)

ORF00983 DNA-entry nuclease (endA)

ORF00984 phenylalanyl-tRNA synthetase, alpha subunit (pheS)

ORF00986 phenylalanyl-tRNA synthetase, beta subunit (pheT)

ORF00988 exonuclease RexB (rexB)

ORF00989 exonuclease RexA (rexA)

ORF00991 tRNA modification GTPase TrmE (trmE)

ORF00992 ABC transporter, ATP-binding protein

ORF00993 acetoin dehydrogenase, thymine PPi dependent, E1 component, alpha subunit

ORF00994 acetoin dehydrogenase, thymine PPi dependent, E1 component, beta subunit

ORF00995 acetoin dehydrogenase, thymine PPi dependent, E2 component, dihydrolipoamide

ORF00996 acetoin dehydrogenase, thymine PPi dependent, E3 component, dihydrolipoamide dehydrogenase

ORF00997 lipoate-protein ligase A (lplA)

ORF00998 cobyric acid synthase, putative

ORF00999 mur ligase family protein

ORF01000 conserved hypothetical protein TIGR00159

ORF01001 expressed protein of unknown function

ORF01002 phosphoglucomutase/phosphomannomutase family protein

ORF01005 oxygen-independent coproporphyrinogen III oxidase, putative

ORF01006 conserved hypothetical protein

ORF01007 hydrolase, haloacid dehalogenase-like family

ORF01008 conserved hypothetical protein

ORF01023 GTP-binding protein LepA (lepA)

ORF01027 PilB-related protein

ORF01030 cation-transporting ATPase, E1-E2 family

ORF01033 conserved hypothetical protein

ORF01040 Tn916, tetracycline resistance protein (tetM)

ORF01057 transcriptional regulator, GntR family

ORF01058 DNA polymerase III, alpha subunit (dnaE)

ORF01059 6-phosphofructokinase (pfk)

ORF01060 pyruvate kinase (pyk)

ORF01063 glucosamine--fructose-6-phosphate aminotransferase (isomerizing) (glmS)

ORF01066 phnA protein (phnA)

ORF01068 amino acid ABC transporter, permease protein

ORF01069 amino acid ABC transporter, ATP-binding protein

ORF01070 amino acid ABC transporter, amino acid-binding protein

ORF01072 ribosomal protein S20 (rpsT)

ORF01073 pantothenate kinase (coaA)

ORF01074 conserved hypothetical protein

ORF01075 cytidine deaminase (cdd)

ORF01076 expressed putative lipoprotein

ORF01077 sugar ABC transporter, ATP-binding protein

ORF01078 sugar ABC transporter, permease protein, putative

ORF01079 sugar ABC transporter, permease protein, putative

ORF01080 NADH oxidase (nox-2)

ORF01081 L-lactate dehydrogenase (ldh)

ORF01082 DNA gyrase, A subunit (gyrA) 

ORF01083 sortase SrtA (srtA) 

ORF01089 GMP synthase (guaA) 

ORF01090 transcriptional regulator, GntR family 

ORF01091 gid protein (gid)

ORF01093 expressed putative lipoprotein 

ORF01097 ABC transporter, ATP-binding protein 

ORF01099 DNA-binding response regulator

ORF01101 site-specific recombinase, phage integrase family 

ORF01106 signal recognition particle protein Ffh (ffh) 

ORF01108 conserved hypothetical protein 

ORF01109 sensor histidine kinase CiaH

ORF01110 DNA-binding response regulator CiaR (ciaR)

ORF01111 aminopeptidase N (pepN)

ORF01112 phosphate transport system regulatory protein PhoU (phoU)

ORF01113 phosphate ABC transporter, ATP-binding protein PstB, putative

ORF01114 phosphate ABC transporter, ATP-binding protein PstB, putative

ORF01115 phosphate ABC transporter, permease protein PstA, putative

ORF01116 phosphate ABC transporter, permease protein

ORF01117 phosphate ABC transporter, phosphate-binding protein

ORF01118 NOL1/NOP2/sun family protein

ORF01119 inositol monophosphatase family protein

ORF01120 conserved hypothetical protein

ORF01121 conserved hypothetical protein

ORF01122 macrolide-efflux protein mreA/riboflavin biosynthesis protein RibF

ORF01123 tRNA pseudouridine synthase B (truB)

ORF01125 conserved hypothetical protein

ORF01128 permease, putative

ORF01129 ABC transporter, ATP-binding protein

ORF01131 DNA topoisomerase I (topA)

ORF01132 DprA/SMF protein, putative DNA processing factor (dprA)

ORF01134 iron compound ABC transporter, ATP-binding protein

ORF01137 acetyltransferase, CysE/LacA/LpxA/NodL family

ORF01138 ribonuclease HII (rnhB)

ORF01139 GTP-binding protein

ORF01176 carbamoyl-phosphate synthase, large subunit (carB)

ORF01177 carbamoyl-phosphate synthase, small subunit (carA) 

ORF01178 aspartate carbamoyltransferase (pyrB) 

ORF01179 dihydroorotase, multifunctional complex type (pyrC) 

ORF01180 orotate phosphoribosyltransferase (pyrE) 

ORF01181 orotidine 5'-phosphate decarboxylase (pyrF) 

ORF01183 ABC transporter, ATP-binding protein 

ORF01184 ribonucleotide reductase, truncation

ORF01188 cardiolipin synthetase (cls)

ORF01189 formate-tetrahydrofolate ligase (fhs)

ORF01190 lipoate-protein ligase A (lplA)

ORF01198 flavoprotein-related protein

ORF01199 flavoprotein family protein

ORF01200 membrane protein, putative

ORF01201 phosphoglucomutase (pgm)

ORF01203 IS861, transposase OrfB

ORF01205 ABC transporter, ATP-binding/permease protein

ORF01206 ABC transporter, ATP-binding/permease protein

ORF01207 conserved hypothetical protein

ORF01208 conserved hypothetical protein

ORF01209 Serine hydroxymethyltransferase

ORF01210 Sua5/YciO/YrdC/YwlC family protein

ORF01211 modification methylase, HemK family

ORF01212 peptide chain release factor 1 (prfA) 

ORF01213 thymidine kinases (tdk) 

ORF01214 4-oxalocrotonate tautomerase (xylM) 

ORF01216 ApbE family protein

ORF01220 xanthine permease (pbuX) 

ORF01221 xanthine phosphoribosyltransferase (xpt) 

ORF01222 guanosine monophosphate reductase (guaC) 

ORF01227 phosphate acetyltransferase

ORF01228 ribosomal large subunit pseudouridine synthase, RluD subfamily

ORF01229 expressed protein of unknown function

ORF01230 GTP pyrophosphokinase family protein

ORF01231 conserved hypothetical protein

ORF01232 ribose-phosphate pyrophosphokinase (prsA)

ORF01233 cysteine desulphurase (iscS)

ORF01234 conserved hypothetical protein

ORF01235 conserved hypothetical protein

ORF01236 DNA repair protein RadC (radC)

ORF01238 6-phospho-beta-glucosidase (ascB)

ORF01239 platelet activating factor, putative

ORF01240 hydrolase, haloacid dehalogenase-like family

ORF01242 voltage-gated chloride channel family protein

ORF01243 spermidine/putrescine ABC transporter, spermidine/putrescine-binding protein (potD)

ORF01244 spermidine/putrescine ABC transporter, permease protein (potC) 

ORF01245 spermidine/putrescine ABC transporter, permease protein (potB) 

ORF01246 spermidine/putrescine ABC transporter, ATP-binding protein (potA) 
ORF01247 UDP-N-acetylenolpyruvoylglucosamine reductase (murB)

ORF01248 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase (folK) 

ORF01250 dihydropteroate synthase (folP) 

ORF01251 GTP cyclohydrolase I (folE) 

ORF01252 folylpolyglutamate synthase (folC) 

ORF01259 aldehyde dehydrogenase family protein 

ORF01260 membrane protein 

ORF01274 gls24 protein, putative

ORF01276 gls24 protein, putative 

ORF01279 conserved hypothetical protein 

ORF01282 ATP-dependent DNA helicase PcrA (pcrA) 

ORF01283 conserved hypothetical protein, FRAMESHIFT

ORF01284 uracil permease (uraA)

ORF01285 sodium:alanine symporter family protein

ORF01286 cation efflux family protein 

ORF01290 ribosomal protein S1 (rpsA) 

ORF01292 branched-chain amino acid aminotransferase (ilvE) 

ORF01294 DNA topoisomerase IV, A subunit (parC) 

ORF01295 DNA topoisomerase IV, B subunit (parE)

ORF01296 membrane protein, putative

ORF01297 uracil-DNA glycosylase (ung)

ORF01317 transcriptional regulator, LysR family, putative 

ORF01319 purine nucleoside phosphorylase (deoD) 

ORF01321 purine nucleoside phosphorylase (deoD)

ORF01323 phosphopentomutase (deoB)

ORF01324 ribose 5-phosphate isomerase (rpiA)

ORF01327 tributyrin esterase (estA)

ORF01328 metallo-beta-lactamase superfamily protein

ORF01329 ABC transporter, ATP-binding protein

ORF01330 ABC transporter, permease protein

ORF01331 conserved hypothetical protein

ORF01332 adherence and virulence protein A (pavA)

ORF01335 TPR domain protein

ORF01336 membrane protein

ORF01338 mutator MutT protein (mutX)

ORF01339 hyaluronidase

ORF01343 iminodiacetate oxidase, putative

ORF01344 conserved hypothetical protein TIGR00486

ORF01345 conserved hypothetical protein

ORF01346 DNA replication protein DnaD, putative

ORF01347 adenine phosphoribosyltransferase (apt) 




ORF01350 single-stranded-DNA-specific exonuclease RecJ (recJ)

ORF01351 oxidoreductase, short chain dehydrogenase/reductase family 

ORF01352 metallo-beta-lactamase superfamily protein 

ORF01353 conserved hypothetical protein 

ORF01354 GTP-binding protein HflX (hflX) 

ORF01355 tRNA delta(2)-isopentenylpyrophosphate transferase (miaA) 

QRF01357 exfoliative toxin A, putative 

ORF01358 pullulanase, putative 

ORF01362 conserved hypothetical protein

ORF01363 peptidase, M20/M25/M40 family

ORF01364 nitroreductase family protein 

ORF01367 excinuclease ABC, C subunit (uvrC)

ORF01380 streptococcal histidine triad family protein 

ORF01381 laminin-binding surface protein (lmb)

ORF01397 Tn5252, relaxase 

ORF01403 mercuric reductase (merA)

ORF01406 IS861, transposase OrfB, truncation

ORF01407 cation-transporting ATPase, E1-E2 family 

ORF01411 conserved hypothetical protein

ORF01412 cation-transporting ATPase, E1-E2 family 

ORF01415 transcriptional repressor CopY, putative 

ORF01416 cadmium resistance transporter, putative 

ORF01451 C-5 cytosine-specific DNA methylase

ORF01453 conserved hypothetical protein 

ORF01455 ribosomal protein L7/L12 (rplL) 

ORF01456 ribosomal protein L10 (rplJ) 

ORF01458 ATP-dependent Clp protease, ATP-binding subunit 

ORF01467 GTP-binding protein (cgpA)

ORF01468 ATP-dependent Clp protease, ATP-binding subunit ClpX (clpX)

ORF01470 dihydrofolate reductase (folA) 

ORF01471 thymidylate synthase (thyA) 

ORF01472 HMG-CoA synthase 

ORF01473 3-hydroxy-3-methylglutaryl-CoA reductase

ORF01474 conserved hypothetical protein

ORF01475 hemolysin III, putative

ORF01476 conserved hypothetical protein TIGR00147

ORF01479 isopentenyl-diphosphate delta-isomerase 

ORF01480 phosphomevalonate kinase

ORF01481 diphosphomevalonate decarboxylase (mvaD) 

ORF01482 mevalonate kinase, putative 

ORF01484 DNA-binding response regulator

ORF01491 polypeptide deformylase, putative 

ORF01495 ABC transporter, ATP-binding/permease protein 

ORF01496 ABC transporter, ATP-binding/permease protein 

ORF01498 ABC transporter, ATP-binding protein 

ORF01499 polyA polymerase family protein

ORF01500 DegV family protein 

ORF01501 expressed protein of unknown function 

ORF01504 PTS system, fructose specific IIABC components

ORF01505 1-phosphofructokinase (fruK)

ORF01506 lactose phosphotransferase system repressor (lacR) 

ORF01507 beta-lactam resistance factor 

ORF01511 pyridine nucleotide-disulphide oxidoreductase family protein

ORF01512 tRNA (guanine-N1)-methyltransferase (trmD)

ORF01513 16S rRNA processing protein RimM (rimM)
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ORF01 51 5 transcriptional regulator, RofA family 

ORF01516 KH domain protein 

ORF01517 ribosomal protein S16 (rpsP) 

ORF01518 permease, putative 

ORF01519 ABC transporter, ATP-binding protein 

ORF01520 conserved hypothetical protein

ORF01523 carbamoyl-phosphate synthase, small subunit (carA) 

ORF01524 pyrimidine operon regulatory protein (pyrR)

ORF01525 ribosomal large subunit pseudouridine synthase, RluD subfamily 

ORF01526 lipoprotein signal peptidase (lspA)

ORF01527 transcriptional regulator, LysR family

ORF01528 ribosomal protein L27 (rpmA) 

ORF01529 conserved hypothetical protein

ORF01530 ribosomal protein L21 (rplU)

ORF01531 conserved hypothetical protein, FRAMESHIFT 

ORF01532 thiamine biosynthesis protein ThiI (thiI)

ORF01533 cysteine desulphurase (iscS)

ORF01536 glutathione reductase (gor) 

ORF01537 conserved hypothetical protein

ORF01538 chorismate synthase (aroC)

ORF01539 3-dehydroquinate synthase (aroB) 

ORF01540 3-dehydroquinate dehydratase (aroD)

ORF01541 conserved hypothetical protein 

ORF01543 ribosomal protein L20 (rplT)

ORF01544 ribosomal protein L35 (rpmI)

ORF01545 translation initiation factor IF-3 (infC)

ORF01546 cytidylate kinase (cmk)

ORF01548 ferredoxin, 4Fe-4S 

ORF01550 peptidase T (pepT)

ORF01551 polysaccharide biosynthesis protein, putative 

ORF01552 UDP-N-acetylmuramoylalanyl-D-glutamate-2,6-diaminopimelate ligase (murE)

ORF01553 iron compound ABC transporter, ATP-binding protein (fepC) 

ORF01555 iron compound ABC transporter, permease protein 

ORF01556 iron compound ABC transporter, permease protein

ORF01558 inorganic pyrophosphatase, manganese-dependent (ppa) 

ORF01559 pyruvate formate-lyase-activating enzyme (pflA)

ORF01560 CBS domain protein

ORF01561 conserved hypothetical protein

ORF01564 PAP2 family protein

ORF01565 membrane protein, putative 

ORF01567 expressed sortase family protein

ORF01568 sortase family protein 

ORF01571 rogB protein FRAMESHIFT (rogB) 

ORF01587 conserved hypothetical protein 

ORF01589 RNA polymerase sigma-70 factor (rpoD) 

ORF01590 DNA primase (dnaG)

ORF01591 large conductance mechanosensitive channel protein (mscL) 

ORF01592 ribosomal protein S21 (rpsU) 

ORF01594 amino acid ABC transporter, amino acid-binding protein

ORF01598 rhodanese family protein 

ORF01602 glycogen phosphorylase (glgP) 

ORF01603 4-alpha-glucanotransferase (malQ)

ORF01604 maltose operon repressor MalR, putative

ORF01605 maltose/maltodextrin ABC transporter, maltose/maltodextrin-binding protein

ORF01606 maltose ABC transporter, permease protein
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jORF01607 maltose ABC transporter, permease protein 
ORF01614 preprotein translocase SecA subunit, putative

ORF01619 preprotein translocase SecY family protein

ORF01634 excinuclease ABC, B subunit (uvrB)

ORF01636 glutamine ABC transporter, glutamine-binding protein/permease protein (glnP)

ORF01637 glutamine ABC transporter, ATP-binding protein GlnQ, putative

ORF01640 GTP-binding protein, GTP1/Obg family (obg)

ORF01646 amidase family protein

ORF01647 ribosomal small subunit pseudouridine synthase A (rsuA)

ORF01648 oxidoreductase, aldo/keto reductase family

ORF01651 lactoylglutathione lyase (gloA)

ORF01652 glycosyl transferase, group 2 family protein

ORF01654 SsrA-binding protein (smpB)

ORF01655 exoribonuclease, VacB/Rnb family (vacB)

ORF01657 preprotein translocase, SecG subunit

ORF01658 multi-drug resistance protein

ORF01662 dephospho-CoA kinase

ORF01663 formamidopyrimidine-DNA glycosylase (mutM)

ORF01677 GTP-binding protein Era (era)

ORF01678 diacylglycerol kinase (dgkA)

ORF01679 conserved hypothetical protein TIGR00043

ORF01685 PhoH family protein



ORF01687 conserved hypothetical protein

ORF01689 conserved hypothetical protein

ORF01690 ribosome recycling factor (frr)

ORF01691 uridylate kinase (pyrH)

ORF01693 peptide ABC transporter, ATP-binding protein FRAMESHIFT

ORF01697 ribosomal protein L1 (rplA)

ORF01698 ribosomal protein L11 (rplK)

ORF01706 IS861, transposase OrfB

ORF01707 chorismate binding enzyme

ORF01708 FtsK/SpoIIIE family protein

ORF01709 peptidyl-prolyl cis-trans isomerase, cyclophilin-type

ORF01710 manganese ABC transporter, permease protein

ORF01711 manganese ABC transporter, ATP-binding protein

ORF01712 manganese ABC transporter, manganese-binding adhesion lipoprotein

ORF01713 iron-dependent transcriptional regulator 

ORF01714 5-methylthioadenosine nucleosidase/S-adenosylhomocysteine nucleosidase (pfs) 

ORF01716 MutT/nudix family protein 

ORF01718 UDP-N-acetylglucosamine pyrophosphorylase (glmU)

ORF01722 oxidoreductase, Gfo/Idh/MocA family

ORF01725 gluconate 5-dehydrogenase, putative

ORF01726 conserved hypothetical protein

ORF01738 branched-chain amino acid transport system II carrier protein (brnQ)

ORF01739 methionyl-tRNA synthetase (metG)

ORF01745 exodeoxyribonuclease (exoA)

ORF01746 conserved hypothetical protein

ORF01752 copper homeostasis protein CutC, putative

ORF01755 tetrapyrrole methylase family protein

ORF01756 conserved hypothetical protein

ORF01758 DNA polymerase III, delta prime subunit, putative

ORF01759 thymidylate kinase (tmk)

ORF01773 ATP-dependent Clp protease, proteolytic subunit ClpP (clpP)

ORF01774 uracil phosphoribosyltransferase (upp)

ORF01777 RNA methyltransferase, TrmH family, group 2
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QRF01781 conserved hypothetical protein TIGR00278 
ORF01782 ribosomal large subunit pseudouridine synthase B (rluB)

ORF01783 conserved hypothetical protein TIGR00281

ORF01784 conserved hypothetical protein

ORF01785 integrase/recombinase, phage integrase family

ORF01786 CBS domain protein

ORF01787 conserved hypothetical protein

ORF01788 HAM1 protein

ORF01789 glutamate racemase (murI)

ORF01791 membrane protein, putative

ORF01792 transcriptional regulator, biotin repressor family

ORF01793 membrane protein, putative

ORF01795 RNA methyltransferase, TrmH family

ORF01796 acylphosphatase

ORF01797 lipoprotein, putative

ORF01799 amino acid ABC transporter, permease protein

ORF01801 amidase family protein

ORF01802 transcription elongation factor GreA (greA)

ORF01803 conserved hypothetical protein

ORF01804 acetyltransferase, GNAT family

ORF01805 UDP-N-acetylmuramate--alanine ligase (murC)

ORF01806 conserved hypothetical protein

ORF01808 expressed putative helicase 

ORF01811 phosphoglycerate dehydrogenase-related protein 
ORF01812 primosomal protein DnaI (dnaI)

ORF01813 conserved hypothetical protein

ORF01814 conserved hypothetical protein TIGR00244

ORF01815 sensor histidine kinase CsrS (csrS)

ORF01816 DNA-binding response regulator CsrR (csrR)

ORF01817 conserved hypothetical protein

ORF01818 heat shock protein HtpX (htpX)

ORF01820 lemA protein (lemA)

ORF01821 glucose-inhibited division protein B (gidB) 

ORF01822 sodium transport family protein 

ORF01823 potassium uptake protein, Trk family, putative 

ORF01825 ABC transporter, ATP-binding protein 

ORF01828 branched-chain amino acid transport system II carrier protein (brnQ)

ORF01829 alcohol dehydrogenase, zinc-containing (adh) 

ORF01830 ABC transporter, permease protein 

ORF01831 ABC transporter, ATP-binding protein 

ORF01833 expressed YaeC family protein 

ORF01834 ABC transporter, substrate-binding protein 

ORF01835 glutamine amidotransferase, class I 

ORF01837 conserved hypothetical protein TIGR01033

ORF01846 glycerol uptake facilitator protein (glpF)

ORF01849 conserved hypothetical protein 

ORF01851 conserved hypothetical protein

ORF01852 iojap-related protein 

ORF01854 conserved hypothetical protein TIGR00488

ORF01855 conserved hypothetical protein TIGR00482

ORF01856 conserved hypothetical protein TIGR00253

ORF01857 GTP-binding protein

ORF01858 hydrolase, haloacid dehalogenase-like family

ORF01860 glutamyl-tRNA(Gln) amidotransferase, B subunit (gatB)

ORF01861 glutamyl-tRNA(Gln) amidotransferase, A subunit (gatA)
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QRFQ1862 glutamyl-tRNA(Gln) amidotransferase, C subunit (gatC) 

ORF01867 isochorismatase family protein 

ORF01869 transcriptional regulator CodY, putative

ORF01870 aminotransferase, class I

ORF01871 universal stress protein family FRAMESHIFT 

ORF01872 hydrolase, haloacid dehalogenase-like family

ORF01873 asparaginase family protein 

ORF01874 shikimate 5-dehydrogenase (aroE)

ORF01876 ATP-dependent DNA helicase RecG (recG)

ORF01878 alanine racemase (alr)

ORF01879 holo-(acyl-carrier-protein) synthase (acpS) 

ORF01881 preprotein translocase, SecA subunit (secA)

ORF01882 mannose-6-phosphate isomerase, class I (manA) 

ORF01883 fructokinase (scrK) 

ORF01885 PTS system, IIABC components

ORF01886 sucrose-6-phosphate hydrolase (scrB) 

ORF01887 sucrose operon repressor ScrR (scrR) 

ORF01888 N utilization substance protein B (nusB) 

ORF01889 conserved hypothetical protein 

ORF01890 translation elongation factor P (efp) 

ORF01900 cytidine/deoxycytidylate deaminase family protein

ORF01906 excinuclease ABC, A subunit (uvrA) 

ORF01907 conserved hypothetical protein

ORF01908 magnesium transporter, CorA family (corA) 

ORF01909 ribosomal protein S18 (rpsR)

ORF01910 single-strand binding protein (ssb) 

ORF01911 ribosomal protein S6 (rpsF) 

ORF01912 A/G-specific adenine glycosylase (mutY) 

ORF01914 thioredoxin (trx) 

ORF01915 PAP2 family protein 

ORF01916 MutS2 family protein 

ORF01917 conserved hypothetical protein 

ORF01918 conserved hypothetical protein 

ORF01919 ribonuclease HIII (rnhC)

ORF01920 signal peptidase I 

ORF01921 helicase, putative 

ORF01923 DNA-damage inducible protein P (dinP) 

ORF01924 formate acetyltransferase (pflD)

ORF01926 conserved hypothetical protein 

ORF01927 proteinase, putative, degenerate, FRAMESHIFT 

ORF01929 glycerol uptake facilitator protein, putative 

ORF01930 universal stress protein family

ORF01933 X-pro dipeptidyl-peptidase (pepX) 

ORF01937 ABC transporter, ATP-binding protein CydC (cydC) 

ORF01938 ABC transporter, ATP-binding protein CydD 

ORF01945 conserved hypothetical protein TIGR00103

ORF01948 exonuclease 

ORF01949 conserved hypothetical protein 

ORF01950 conserved hypothetical protein TIGR00275

ORF01952 ribosomal protein S14 (rpsN)

ORF01957 O-sialoglycoprotein endopeptidase family protein

ORF01958 ribosomal-protein-alanine acetyltransferase, putative 

ORF01960 expressed protein of unknown function

ORF01961 conserved hypothetical protein 

ORF01962 metallo-beta-lactamase superfamily protein 
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I ORF01963 conserved hypothetical protein 
ORF01964 glutamine synthetase, type I (glnA)
ORF01965 transcriptional regulator GlnR (glnR)
ORF01967 conserved hypothetical protein
ORF01969 phosphoglycerate kinase (pgk) 
ORF01971 glyceraldehyde 3-phosphate dehydrogenase (gap) 
ORF01972 translation elongation factor G (fusA) 
ORF01973 ribosomal protein S7 (rpsG) 
ORF01974 ribosomal protein S12 (rpsL) 
ORF01975 pur operon repressor (purR) 
ORF01976 HD domain protein
ORF01977 conserved hypothetical protein 
ORF01978 conserved hypothetical protein 
ORF01979 ribulose-phosphate 3-epimerase (rpe) 
ORF01980 conserved hypothetical protein TIGR00157 
ORF01983 dimethyladenosine transferase (ksgA) 
ORF01985 primase-related protein 
ORF01987 deoxyribonuclease, TatD family 
ORF01992 dltD protein (dltD) 
ORF01993 D-alanyl carrier protein (dltC)
ORF01994 dltB protein (dltB)
ORF01996 D-alanine-activating enzyme (dltA)
ORF01997 sensor histidine kinase
ORF01998 DNA-binding response regulator
ORF01999 ribosomal protein L34 (rpmH)
ORF02004 amino acid ABC transporter, ATP-binding protein
ORF02007 conserved hypothetical protein
ORF02008 transcriptional antiterminator, BglG family
ORF02017 sugar binding transcriptional regulator, LacI family
ORF02018 transaldolase family protein
ORF02019 carbohydrate isomerase, AraD/FucA family

ORF02020 hexulose-6-phosphate isomerase, putative
ORF02021 hexulose-6-phosphate synthase, putative
ORF02022 PTS system, IIA component
ORF02023 PTS system, IIB component
ORF02024 transport protein SgaT, putative
ORF02027 adenylosuccinate synthetase (purA)
ORF02033 chaperonin, 33 kDa (hslO)
ORF02034 NifR3/Smm1 family protein

ORF02037 ATP-dependent Clp protease, ATP-binding subunit

ORF02038 transcriptional regulator CtsR (ctsR) 

ORF02040 translation elongation factor Ts (tsf)

ORF02041 ribosomal protein S2 (rpsB)

ORF02043 alkyl hydroperoxide reductase, subunit F (ahpF) 

ORF02076 prophage LambdaSa2, single-strand binding protein (ssb) 

ORF02082 prophage LambdaSa2, type II DNA modification methyltransferase, putative

ORF02086 prophage LambdaSa2, replicative DNA helicase (dnaC)

ORF02104 endopeptidase O (pepO) 

ORF02110 polypeptide deformylase (def)

ORF02111 sugar binding transcriptional regulator RegR (regR)

ORF02112 conserved hypothetical protein

ORF02113 PTS system, IID component

ORF02114 PTS system, IIC component

ORF02115 PTS system, IIB component

ORF02116 glucuronyl hydrolase
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QRF02118 PTS system, HA component 

ORF02120 oxidoreductase, short-chain dehydrogenase/reductase family
ORF02121 conserved hypothetical protein
ORF02122 carbohydrate kinase, PfkB family

ORF02123 2-dehydro-3-deoxyphosphogluconate aldolase/4-hydroxy-2-oxoglutarate aldolase (eda)
ORF02127 DNA polymerase III, alpha subunit, Gram-positive type

ORF02129 prolyl-tRNA synthetase (proS)

ORF02130 membrane-associated zinc metalloprotease, putative 

ORF02131 phosphatidate cytidylyltransferase (cdsA) 

ORF02132 undecaprenyl diphosphate synthase (uppS) 

ORF02133 preprotein translocase, YajC subunit (yajC)

ORF02140 glucan 1,6-alpha-glucosidase (dexB)

ORF02141 sugar ABC transporter, ATP-binding protein (msmK)

ORF02142 helix-turn-helix domain protein, fis-type

ORF02144 tagatose 1,6-diphosphate aldolase (lacD)

ORF02145 tagatose-6-phosphate kinase (lacC)

ORF02146 galactose-6-phosphate isomerase, LacB subunit (lacB)

ORF02147 galactose-6-phosphate isomerase, LacA subunit (lacA)

ORF02149 PTS system, IIC component, putative

ORF02150 PTS system, IIB component, putative

ORF02152 PTS system, IIA component, putative

ORF02153 lactose phosphotransferase system repressor (lacR) 

ORF02157 adhesion lipoprotein 

ORF02158 expressed protein of unknown function TIGR00256

ORF02159 GTP pyrophosphokinase (relA)

ORF02161 nrdI protein (nrdI)

ORF02164 iron ABC transporter, iron-binding protein

ORF02165 DNA-binding response regulator 

ORF02167 PTS system, IID component

ORF02168 PTS system, IIC component

ORF02174 ABC transporter, ATP-binding protein 

ORF02176 response regulator

ORF02177 conserved hypothetical protein
ORF02178 PTS system, IIABC components

ORF02179 sensor histidine kinase 

ORF02180 phosphate regulon response regulator PhoB (phoB) 
ORF02182 phosphate ABC transporter, ATP-binding protein (pstB) 
ORF02183 phosphate ABC transporter, permease protein
ORF02184 phosphate ABC transporter, permease protein
ORF02188 conserved hypothetical protein TIGR00046
ORF02189 ribosomal protein L11 methyltransferase (prmA) 

ORF02197 conserved hypothetical protein

ORF02199 ATPase, AAA family

ORF02249 mercuric reductase (merA) 

ORF02272 DNA topology modulation protein FlaR, putative 
ORF02273 glycerol dehydrogenase, putative 

ORF02281 DNA-binding response regulator 

ORF02285 leucyl-tRNA synthetase (leuS) 

ORF02290 transcription antitermination protein NusG (nusG)

ORF02293 penicillin-binding protein 2A (pbp2A) 

ORF02294 ribosomal large subunit pseudouridine synthase, RluD subfamily 
ORF02296 phosphopentomutase (deoB) 
ORF02297 deoxyribose-phosphate aldolase (deoC)

ORF02300 uridine phosphorylase (udp)

ORF02302 60 kda chaperonin (groEL) 
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I QRF02303 chaperonin, 10 kDa (groES) 
ORF02305 ABC transporter, ATP-binding protein

ORF02306 ABC transporter, permease protein

ORF02307 expressed putative lipoprotein

ORF02309 glyoxalase family protein

ORF02310 conserved hypothetical protein

ORF02311 anaerobic ribonucleoside-triphosphate reductase activating protein (nrdG)

ORF02312 acetyltransferase, GNAT family

ORF02315 anaerobic ribonucleoside-triphosphate reductase (nrdD)

ORF02318 conserved hypothetical protein

ORF02320 conserved hypothetical protein

ORF02321 conserved hypothetical protein

ORF02322 recA protein (recA)

ORF02325 DNA-3-methyladenine glycosylase I (tag)

ORF02327 Holliday junction DNA helicase RuvA (ruvA)

ORF02329 DNA mismatch repair protein HexB (hexB)

ORF02333 arginine repressor ArgR, putative

ORF02334 arginyl-tRNA synthetase (argS)

ORF02337 conserved hypothetical protein

ORF02338 conserved hypothetical protein

ORF02339 aspartyl-tRNA synthetase (aspS)

ORF02340 histidyl-tRNA synthetase (hisS)

ORF02342 ribosomal protein L33 (rpmG)

ORF02357 DNA-binding response regulator

ORF02359 membrane protein, putative



ORF02360 carbamate kinase (arcC)

ORF02361 ornithine carbamoyltransferase (argF)

ORF02364 amino acid ABC transporter, ATP-binding protein

ORF02365 amino acid ABC transporter, permease and amino acid-binding protein

ORF02370 membrane protein, putative

ORF02371 transcriptional regulator, TetR family, putative

ORF02373 ribosomal protein S4 (rpsD)

ORF02374 conserved hypothetical protein

ORF02375 replicative DNA helicase (dnaC)

ORF02376 ribosomal protein L9 (rplI)

ORF02377 DHH family protein

ORF02378 glucose inhibited division protein A (gidA)

ORF02380 tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase (trmU)

ORF02381 L-serine dehydratase, iron-sulfur-dependent, beta subunit (sdhB)

ORF02382 L-serine dehydratase, iron-sulfur-dependent, alpha subunit (sdhA)

ORF02385 cobalt transport family protein

ORF02386 ABC transporter, ATP-binding protein

ORF02387 ABC transporter, ATP-binding protein, FRAMESHIFT

ORF02388 CDP-diacylglycerol-glycerol-3-phosphate 3-phosphatidyltransferase (pgsA)



ORF02389 peptidase, M16 family

ORF02390 conserved hypothetical protein

ORF02391 conserved hypothetical protein

ORF02392 recF protein (recF)

ORF02396 inosine-5-monophosphate dehydrogenase (guaB)

ORF02397 transcriptional regulator, ArgR family

ORF02400 arginine deiminase (arcA)

ORF02402 ornithine carbamoyltransferase (argF)

ORF02404 carbamate kinase (arcC)

ORF02405 tryptophanyl-tRNA synthetase (trpS)

ORF02407 conserved hypothetical protein
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ORFQ2408 ABC transporter, ATP-binding protein _ 

ORF02409 ABC transporter, permease protein, putative

ORF02410 conserved hypothetical protein TIGR00246

ORF02411 serine protease

ORF02412 partitioning protein, ParB family

ORF02413 chromosomal replication initiator protein DnaA (dnaA)

ORF02415 DNA polymerase III, beta subunit (dnaN) 

ORF02417 conserved hypothetical protein 

ORF02419 conserved hypothetical GTP-binding protein 

ORF02420 peptidyl-tRNA hydrolase (pth)

ORF02421 transcription-repair coupling factor (mfd)

ORF02423 S4 domain protein

ORF02424 cell division protein DivIC, putative 

ORF02426 expressed protein of unknown function

ORF02427 MesJ/Ycf62 family protein 

ORF02429 cell division protein FtsH (ftsH) 
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ORFQQ017 phosphoribosylaminoimidazolecarboxamide formyltransf erase/I MP cyclohydrolase (purH) 
ORF00025 conserved hypothetical protein 

ORF00029 acetyl xylan esterase, putative

ORF00042 aldehyde-alcohol dehydrogenase (adhE) 
ORF00044 threonine synthase (thrC) 
ORF00081 ribosomal protein L17 (rplQ)

ORF00090 conserved hypothetical protein
ORF00129 argininosuccinate synthase (argG) 

ORF00156 oligopeptide ABC transporter, substrate-binding protein, putative 
ORF00189 protease, putative

ORF00194 thioredoxin family protein
ORF00195 tRNA binding domain protein 

ORF00217 conserved domain protein

ORF00218 PTS system, IIB component, putative
ORF00220 transketolase, N-terminal subunit
ORF00221 transketolase, C-terminal subunit
ORF00223 oxidoreductase, putative 
ORF00282 acetyltransferase, GNAT family 
ORF00290 IS1381, transposase OrfB
ORF00291 IS1381, transposase OrfA
ORF00293 conserved hypothetical protein 

ORF00301 membrane protein, putative
ORF00343 ABC transporter, permease protein, putative
ORF00344 conserved hypothetical protein

ORF00382 aspartate kinase family protein 
ORF00399 conserved hypothetical protein
ORF00439 cell wall surface anchor family protein

ORF00447 cytidine/deoxycytidylate deaminase family protein 

ORF00450 5-formyltetrahydrofolate cyclo-ligase family protein

ORF00480 transcriptional regulator, MerR family 

ORF00499 acetyltransferase, GNAT family 

ORF00504 magnesium transporter, CorA family
ORF00521 VanZF domain protein
ORF00612 IS1381, transposase OrfA
ORF00613 IS1381, transposase OrfB

ORF00690 transmembrane protein Vexp1 (vex1)

ORF00691 ABC transporter, ATP-binding protein Vexp2 (vex2)

ORF00692 transmembrane protein Vexp3 (vex3)

ORF00714 conserved hypothetical protein 

ORF00732 expressed cell wall surface anchor family protein, putative 

ORF00774 ABC transporter, ATP-binding protein
ORF00778 ABC transporter, ATP-binding protein 
ORF00780 conserved hypothetical protein 
ORF00790 beta-glucuronidase 
ORF00800 alpha amylase family protein 

ORF00807 amino acid ABC transporter, permease protein 

ORF00809 amino acid ABC transporter, amino acid-binding protein
ORF00814 conserved hypothetical protein 
ORF00823 bacterial luciferase family protein 
ORF00840 riboflavin biosynthesis protein RibD (ribD)
ORF00841 riboflavin synthase, alpha subunit (ribE)

ORF00842 riboflavin biosynthesis protein RibA (ribA)

ORF00843 riboflavin synthase, beta subunit (ribH)

ORF00866 penicillin-binding protein 2b 
ORF00905 membrane protein, putative 
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ORFxxxxx Annotation 

ORF00910 major facilitator family protein

ORF00913 hydrolase, haloacid dehalogenase-like family 

ORF00918 conserved hypothetical protein 

ORF00945 conserved hypothetical protein 

ORF00948 ABC transporter, ATP-binding protein 

ORF00952 phosphomethylpyrimidine kinase (thiD) 

ORF00953 hydroxyethylthiazole kinase (thiM)

ORF00954 thiamine-phosphate pyrophosphorylase (thiE) 

ORF00961 GtrA family protein 

ORF00967 1,4-alpha-glucan branching enzyme (glgB)

ORF00968 glucose-1-phosphate adenylyltransferase (glgC)
ORF00971 glycogen synthase (glgA)
ORF00985 acetyltransferase, GNAT family
ORF00990 magnesium transporter, CorA family, putative
ORF01022 nucleoside diphosphate kinase (ndk)
ORF01031 nucleoside diphosphate kinase domain protein 
ORF01085 conserved hypothetical protein 
ORF01087 IS1381, transposase OrfA
ORF01088 IS1381, transposase OrfB
ORF01098 ABC transporter, permease protein, putative
ORF01100 sensor histidine kinase
ORF01102 ABC transporter, substrate-binding protein
ORF01127 protease, putative

ORF01135 iron compound ABC transporter, permease protein
ORF01136 iron compound ABC transporter, permease protein
ORF01185 aspartate-semialdehyde dehydrogenase (asd)

ORF01217 conserved hypothetical protein 

ORF01218 conserved hypothetical protein 
ORF01219 formate/nitrite transporter family protein

ORF01226 oxidoreductase, short chain dehydrogenase/reductase family, FRAMESHIFT 

ORF01254 homoserine kinase (thrB) 

ORF01255 homoserine dehydrogenase (hom)

ORF01264 transcriptional regulator, Cro/CI family

ORF01268 thiol peroxidase (psaD) 

ORF01305 glycosyltransferase CpsJ(V) (cpsJ) 

ORF01306 glycosyltransferase CpsO(V) (cpsO)

ORF01313 CpsD protein (cpsD)

ORF01314 cpsC protein (cpsC) 

ORF01315 capsular polysaccharide biosynthesis protein CpsB (cpsB) 
ORF01316 capsular polysaccharide biosynthesis protein CpsA (cpsA) 
ORF01326 conserved hypothetical protein 
ORF01333 alpha-acetolactate decarboxylase (budA) 
ORF01334 acetolactate synthase, catabolic (ilvK) 
ORF01337 MutT/nudix family protein 
ORF01369 MATE efflux family protein
ORF01398 Tn5252, Orf 9 protein
ORF01399 Tn5252, Orf 10 protein 
ORF01446 protease, putative
ORF01447 conserved hypothetical protein
ORF01449 conserved hypothetical protein
ORF01492 NADP-specific glutamate dehydrogenase (gdhA)
ORF01569 expressed cell wall surface anchor family protein 
ORF01570 cell wall surface anchor family protein 
ORF01574 polysaccharide biosynthesis protein
ORF01579 nucleotidyl transferase, putative
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